DOCUMENT RESUME

ED 288 887                                        TM 870 616

AUTHOR          Beaton, Albert E.
TITLE           Implementing the New Design: The NAEP 1983-84
                Technical Report.
INSTITUTION     National Assessment of Educational Progress,
                Princeton, NJ.
SPONS AGENCY    Center for Statistics (OERI/ED), Washington, DC.
REPORT NO       ISBN-0-88685-062-2; NAEP-15-TR-20
PUB DATE        Mar 87
GRANT           NIE-G-83-0011
NOTE            813p.; For the "Users' Guide" and "Codebooks and
                Layouts" pertaining to the NAEP Public-Use data tapes
                for 1983-84, see TM 870 621-624.
AVAILABLE FROM  National Assessment of Educational Progress,
                Educational Testing Service, Rosedale Road, CN 6710,
                Princeton, NJ 08541-6710 ($25.00, plus $3.00
                postage).
PUB TYPE        Reports - Research/Technical (143)
EDRS PRICE      MF05/PC33 Plus Postage.
DESCRIPTORS     Academic Achievement; Data Analysis; Data Collection;
                Data Interpretation; *Educational Assessment;
                Elementary Secondary Education; *Item Sampling;
                Latent Trait Theory; *National Competency Tests;
                National Surveys; Program Implementation; Reading
                Tests; *Research Design; Sampling; Scaling;
                Statistical Analysis; *Testing Programs; Writing
                Evaluation
IDENTIFIERS     *Balanced Incomplete Block Spiralling; *National
                Assessment of Educational Progress


ABSTRACT 

In 1982, the Educational Testing Service (ETS)
proposed to implement a new, complex design for the National
Assessment of Educational Progress (NAEP). The major features of this
design are described in "A New Design for a New Era" (Messick,
Beaton, and Lord, 1983). The purpose of this document is to describe
the actual implementation of the design in the 1983-84 National
Assessment of Reading and Writing (NAEP's fifteenth year); it is
intended as a supplement to the reports of that assessment (see ED
264 550, ED 273 680, ED 273 994) and supports these reports by
providing detailed technical information so that the accuracy of the
substantive results can be judged. Some major features of the new
design were: to sample grades 4, 8, and 11 as well as students aged
9, 13, and 17 (in school); to introduce Balanced Incomplete Block
(BIB) spiralling as a method of estimating inter-relationships among
variables; to collect extensive information about teachers,
principals, and schools; and to scale the reading data, if possible.
These innovations were added to the previously used procedures, which
were kept to ensure maintenance of NAEP trends. This report
describes: (1) the data collection processes, including the
assessment instruments for reading and writing; (2) the data analysis
process for both reading and writing, including "plausible values" of
reading proficiency and the NAEP reading and writing; and (3) some
estimates of the reading and writing proficiencies of selected
subpopulations of the sampled students. Two supplementary studies on
the validity of NAEP's reading and writing assessment instruments and
the design effects in the 1983-84 sample are also presented. A
glossary of terms and a 124-item reference list complete the
document. (JGL)


IMPLEMENTING
THE NEW DESIGN:

THE NAEP 1983-84
TECHNICAL REPORT


Albert  E.  Beaton 


"PERMISSION TO REPRODUCE THIS
MATERIAL  HAS  BEEN  GRANTED  BY 


TO  THE  EDUCATIONAL  RESOURCES 
INFORMATION  CENTER  (ERIC)" 


REPORT  NO:  15-TR-20 


THE NATION'S
REPORT CARD

NAEP


U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

This document has been reproduced as
received from the person or organization
originating it.

Minor changes have been made to improve
reproduction quality.

Points of view or opinions stated in this
document do not necessarily represent official
OERI position or policy.


National 
Assessment  of 
Educational 
Progress 




IMPLEMENTING 
THE  NEW  DESIGN: 


THE  NAEP  1983-84 
TECHNICAL  REPORT 


Albert  E.  Beaton 


in  collaboration  with 

John L. Barone, Anne Campbell, John J. Ferris,
David S. Freund, Eugene G. Johnson, Janet R. Johnson,
Bruce A. Kaplan, Debra L. Kline, Robert J. Mislevy,
Ina V. S. Mullis, Norma A. Norris, Alfred M. Rogers,
Kathleen M. Sheehan, Marilyn Wingersky, Rebecca Zwick

Educational  Testing  Service  •  Princeton,  NJ 

and 

John  Burke,  Nancy  Caldwell,  Morris  H.  Hansen, 
Josefina  A.  Lago,  Renee  Slobasky,  Benjamin  J.  Tepping 

Westat,  Inc.  •  Washington,  DC 


March 1987


National 
Assessment  of 
Educational 
Progress 




The National Assessment of Educational Progress is funded by the U.S. Department of Education under a grant to
Educational Testing Service. National Assessment is an education research project mandated by Congress to collect
data over time on the performance of young Americans in various learning areas. It makes available information on
assessment procedures to state and local education agencies.

This report, No. 15-TR-20, can be ordered from the National Assessment of Educational Progress at Educational Testing
Service, Rosedale Road, Princeton, New Jersey 08541.

Library of Congress Catalog Card Number 87-60432
ISBN 0-88685-062-2

The work upon which this publication is based was performed pursuant to Grant No. NIE-G-83-0011 of the Office for
Educational Research and Improvement, Center for Statistics. It does not, however, necessarily reflect the views of that
agency.

Educational Testing Service is an equal opportunity/affirmative action employer.

Educational Testing Service, ETS, and [logo] are registered trademarks of Educational Testing Service.


IMPLEMENTING  THE  NEW  DESIGN: 
THE  NAEP  1983-84  TECHNICAL  REPORT 


CONTENTS

                                                                  PAGE

Index of Tables and Figures   vii

Executive Summary   xv

Acknowledgments   xvii

PART I

Chapter 1     Introduction to the Technical Report   3
              Albert E. Beaton

Chapter 2     Overview of Part I: The Design and Implementation of
              the Year 15 NAEP (and tabular summary of NAEP data)   13
              Albert E. Beaton

Chapter 3     Development of the Year 15 NAEP Reading and Writing
              Assessments   47
              Ina V. S. Mullis

Chapter 4     Sample Selection and Instrument Collection   79
              Morris H. Hansen, Benjamin J. Tepping,
              Josefina A. Lago, John Burke

Chapter 5     The Assignment of Exercises to Students   97
              Albert E. Beaton, Eugene G. Johnson, John J. Ferris

Chapter 6     Instrument and Item Information   119
              Janet R. Johnson

Chapter 7     Field Administration   135
              Renee Slobasky, Nancy Caldwell

Chapter 8     Materials Processing and Database Creation   161
              John L. Barone

Chapter 8.1   Processing Assessment Materials   165
              Alfred M. Rogers, Norma A. Norris

Chapter 8.2   Professional Scoring   175
              Anne Campbell


Chapter 8.3   Data Entry System   185
              Alfred M. Rogers

Chapter 8.4   Editing Data   201
              Alfred M. Rogers

Chapter 8.5   Quality Control   205
              John J. Ferris

Chapter 8.6   Database Creation   211
              Alfred M. Rogers

Chapter 8.7   Public-Use Data Tape Construction   215
              Alfred M. Rogers

PART II

Chapter 9     Overview of Part II: Analysis of the Year 15
              NAEP Data   225
              Albert E. Beaton

Chapter 10    The Reading Data Analysis: Introduction   239
              Robert J. Mislevy

Chapter 10.1  Assessment of the Dimensionality of Year 15
              Reading Data   245
              Rebecca Zwick

Chapter 10.2  Joint Estimation Procedures   285
              Marilyn Wingersky, Bruce A. Kaplan,
              Albert E. Beaton

Chapter 10.3  Marginal Estimation Procedures   293
              Robert J. Mislevy, Kathleen M. Sheehan

Chapter 10.4  Trend Analysis   361
              Robert J. Mislevy, Kathleen M. Sheehan

Chapter 10.5  The NAEP Reading Scale   381
              Albert E. Beaton

Chapter 11    The Writing Data Analysis: Introduction   391
              Albert E. Beaton

Chapter 11.1  The Writing Exercise Data   397
              Albert E. Beaton

Chapter 11.2  The Effect of Mode of Item Administration (BIB Spiral
              or Paced Tape) on Estimates of Writing Performance   405
              Eugene G. Johnson


Chapter 11.3  Estimation of Trends in Writing Achievement   431
              Eugene G. Johnson

Chapter 11.4  The Average Response Method (ARM) of Scaling   435
              Albert E. Beaton, Eugene G. Johnson

Chapter 12    Background and Attitude Data Analysis   481
              Albert E. Beaton, Norma A. Norris,
              Janet R. Johnson

Chapter 13    Parameter Estimation   491
              Albert E. Beaton

Chapter 13.1  Weighting Procedures   493
              Eugene G. Johnson, Morris H. Hansen,
              Benjamin J. Tepping, Josefina A. Lago, John Burke

Chapter 13.2  Estimation of Uncertainty Due to Sampling Variability   505
              Eugene G. Johnson

Chapter 13.3  Estimation of Variability Due to Imputation   513
              Robert J. Mislevy

Chapter 13.4  Use of the NAEP Almanacs   517
              Rebecca Zwick

Chapter 14    Supplementary Studies   523
              Albert E. Beaton

Chapter 14.1  Validity Issues in NAEP: Year 15 Reading
              and Writing Assessments   525
              Rebecca Zwick

Chapter 14.2  Design Effects   545
              Eugene G. Johnson

PART III

Chapter 15    Estimates of the Reading and Writing Proficiency
              of American Students   565
              Albert E. Beaton, David S. Freund,
              Bruce A. Kaplan

Appendix A: Assessment Items   645

Appendix B: Reading Trend Analysis Items   675

Glossary   735

List of References   751

Subject Index   763


IMPLEMENTING  THE  NEW  DESIGN: 
THE  NAEP  1983-84  TECHNICAL  REPORT 


INDEX  OF  TABLES  AND  FIGURES 

NUMBER      TITLE   PAGE

2(1)        NAEP learning areas, grades and ages assessed   16
2(2)        Measurement instruments developed by ETS   29
2(3)        Number of items administered   30
2(4)        Number of reading and writing exercises by type of administration   31
2(5)        Allocation of PSUs to regions and community types   32
2(6)        Characteristics of schools   33
2(7)        Number of responses to teacher questionnaire   35
2(8)        Number of assessment sessions by administration type   36
2(9)        Number of students by administration type   37
2(10)       Spiral sample by demographic characteristics, Grade 4/Age 9   38
2(11)       Spiral sample by demographic characteristics, Grade 8/Age 13   39
2(12)       Spiral sample by demographic characteristics, Grade 11/Age 17   40
2(13)       Excluded student sample by demographic characteristics, Grade 4/Age 9   41
2(14)       Excluded student sample by demographic characteristics, Grade 8/Age 13   42
2(15)       Excluded student sample by demographic characteristics, Grade 11/Age 17   43
2(16)       Tape sample by demographic characteristics, Grade 4/Age 9   44
2(17)       Tape sample by demographic characteristics, Grade 8/Age 13   45
2(18)       Tape sample by demographic characteristics, Grade 11/Age 17   46
4(1)        Summary of school participation experience   87
4(2)        Weighted and unweighted distribution of excluded students, by reason for exclusion and grade/age   91
4(3)        Comparisons of Year 15 target assessments to actual assessments, by grade/age   92
4(4)        Comparison of Year 15 and Year 13 proportion of excluded students, by grade/age   93
4(5)        Comparison of Year 15 and Year 13 student participation rates, by type of PSU and grade/age   94
4(6)        Distribution of teachers by age class and participation status   96


5(1)        Sample design summary   101
5(2)        Booklet design, BIB spiral sample   102
5(3)        Spiral sample, block-to-booklet correspondence   105
5(4)        Booklet design, UBIB spiral sample   106
5(5)        Number of pairings of item blocks in spiral design   107
5(6)        Number of booklets administered   111
5-1         BIB spiral sample, number of students per block   113
5(7)        Number of blocks administered, spiral and tape samples   114
5-2         BIB spiral, UBIB spiral and tape samples: Number of students per booklet   115
6(1)        Booklet contents by block, Grade 4/Age 9   120
6(2)        Booklet contents by block, Grade 8/Age 13 and Grade 11/Age 17   121
6(3)        Assessment items for Grade 4/Age 9   123
6(4)        Assessment items for Grade 8/Age 13   124
6(5)        Assessment items for Grade 11/Age 17   125
6(6)        Items by block on tapes, Age 9   128
6(7)        Items by block on tapes, Age 13   129
6(8)        Items by block on tapes, Age 17   130
6(9)        Year 15 writing items   131
7(1)        Criteria met by NAEP supervisors, by region   138
7(2)        Frequency of makeup sessions   147
7(3)        Regular and makeup sessions conducted   147
7(4)        Change in attendance rates with makeup sessions   147
7(5)        Number and distribution of quality control visits   153
8-1         NAEP data flow overview   162
8.2(1)      Distribution of reading and writing exercises   178
8.3-1       Student data entry processing   186
10.1(1)     Number of items and students available for dimensionality analyses   252
10.1(2)     Eigenvalues and descriptive statistics for phi matrices   254
10.1(3)     Eigenvalues and descriptive statistics for tetrachoric matrices   255
10.1(4)     Eigenvalues of the image correlation matrix   260
10.1(5)     First five eigenvalues of correlation and image correlation matrices for simulation data   261
10.1(6)     Results of Rosenbaum analyses   267
10.1(7)     Subjects available to estimate within- and across-block correlations for 30-item BIB simulation   270
10.1(8)     Distribution of residual correlations for 30-item BIB simulation   271


10.1(9)     Partial comparison of eigenstructure of BIB and complete data correlation matrices for 30-item simulation   272
10.2(1)     Values assigned to examinees whose maximum likelihood estimates could not be computed
10.2(2)     Minimum and maximum scores and number of values changed by grade/age
10.2-1      Distributions of adjusted proficiency scale scores   290
10.3(1)     Blocks selected for scaling the Year 15 reading data
10.3-1      Diagnostic fit plot for item 9
10.3-2      Bias plot for item 10
10.3(2)     Coding of background variables, BIB data
10.3(3)     Estimated conditional effects, BIB data
10.3(4)     Sampling procedure used to generate plausible values
10.3(5)     Reliability coefficients by booklet, Grade 4/Age 9
10.3(6)     Reliability coefficients by booklet, Grade 8/Age 13
10.3(7)     Reliability coefficients by booklet, Grade 11/Age 17
10.3(8)     Approximate shrinkage of regression coefficients of nonconditioned background variables, Grade 4/Age 9
10.3(9)     Approximate shrinkage of regression coefficients of nonconditioned background variables, Grade 8/Age 13
10.3(10)    Approximate shrinkage of regression coefficients of nonconditioned background variables, Grade 11/Age 17
10.3-3      BIB-Pace percent correct, Age 9 total, IRT items
10.3-4      BIB-Pace percent correct, Age 9 male, IRT items
10.3-5      BIB-Pace percent correct, Age 9 female, IRT items
10.3-6      BIB-Pace percent correct, Age 9 white, IRT items
10.3-7      BIB-Pace percent correct, Age 9 black, IRT items
10.3-8      BIB-Pace percent correct, Age 9 Hispanic, IRT items
10.3-9      BIB-Pace percent correct, Age 13 total, IRT items
10.3-10     BIB-Pace percent correct, Age 13 male, IRT items
10.3-11     BIB-Pace percent correct, Age 13 female, IRT items
10.3-12     BIB-Pace percent correct, Age 13 white, IRT items
10.3-13     BIB-Pace percent correct, Age 13 black, IRT items
10.3-14     BIB-Pace percent correct, Age 13 Hispanic, IRT items
10.3-15     BIB-Pace percent correct, Age 17 total, IRT items
10.3-16     BIB-Pace percent correct, Age 17 male, IRT items
10.3-17     BIB-Pace percent correct, Age 17 female, IRT items
10.3-18     BIB-Pace percent correct, Age 17 white, IRT items
10.3-19     BIB-Pace percent correct, Age 17 black, IRT items
10.3-20     BIB-Pace percent correct, Age 17 Hispanic, IRT items
10.3(11)    Correlations and regression coefficients for spiral vs. paced percent correct of IRT reading items
10.3-21     BIB/Pace population equating, Age 9
10.3-22     BIB/Pace population equating, Age 13
10.3-23     BIB/Pace population equating, Age 17
10.3(12)    BIB/Pace subgroup means, Age 9


10.3(13)    BIB/Pace subgroup means, Age 13   355
10.3(14)    BIB/Pace subgroup means, Age 17   357
10.3-24     Comparison of estimated b values, Age 9 vs. 13   359
10.3-25     Comparison of estimated b values, Age 17 vs. 13   360
10.4(1)     Booklets selected for calibrating trend items   364
10.4(2)     "Changed" reading items   365
10.4(3)     Proportion correct for questionable items   366
10.4-1      Diagnostic plot for item 87   368
10.4-2      Plots of items excluded during preliminary calibrations   369
10.4(4)     Items excluded during preliminary calibration runs   372
10.4(5)     Item calibration summary   372
10.4(6)     Additional booklets used for estimating conditional distributions   375
10.4(7)     Estimated conditional effects, Year 2 Pace data   376
10.4(8)     Estimated conditional effects, Year 6 Pace data   377
10.4(9)     Estimated conditional effects, Year 11 Pace data   378
10.4(10)    Estimated conditional effects, Year 15 Pace data   379
10.5-1      Levels of proficiency   389
11(1)       Year 15 writing items   394
11.1(1)     Percentages of exact score point agreement and intra-class correlation coefficients for primary trait scoring, Year 15   398
11.1(2)     Reliability statistics for primary trait ratings   400
11.1(3)     Percentages of exact score point agreement and intra-class correlation coefficients for primary trait scoring conducted in 1983-84   401
11.1(4)     Number of students responding to each writing exercise   403
11.2(1)     Writing exercises selected for the BIB/Pace comparison   409
11.2(2)     Effect of mode of administration on writing performance, Age 9, "Aunt May"   410
11.2(3)     Effect of mode of administration on writing performance, Age 9, "Dali"   412
11.2(4)     Effect of mode of administration on writing performance, Age 9, "Hole in the Box"   414
11.2(5)     Effect of mode of administration on writing performance, Age 13, "Split Session"   416
11.2(6)     Effect of mode of administration on writing performance, Age 13, "Dali"   418
11.2(7)     Effect of mode of administration on writing performance, Age 13, "Hole in the Box"   420
11.2(8)     Effect of mode of administration on writing performance, Age 17, "Split Session"   422
11.2(9)     Effect of mode of administration on writing performance, Age 17, "Dali"   424


11.2(10)    Effect of mode of administration on writing performance, Age 17, "Hole in the Box"   426
11.2-1      Difference between BIB and Pace percentages, Age 9   428
11.2-2      Difference between BIB and Pace percentages, Age 13   429
11.2-3      Difference between BIB and Pace percentages, Age 17   430
11.3(1)     Exercises used to estimate trends in writing performance   433
11.4(1)     Writing items for the ARM writing scale   450
11.4(2)     Distribution by grade of the number of writing scale items taken by a student   450
11.4-1a     Grade 4 conditioned variables, group effects   465
11.4-1b     Grade 8 conditioned variables, group effects   466
11.4-1c     Grade 11 conditioned variables, group effects   467
11.4-2a     Grade 4 unconditioned variables, group effects   468
11.4-2b     Grade 8 unconditioned variables, group effects   469
11.4-2c     Grade 11 unconditioned variables, group effects   470
11.4-3a     Grade 4 F-values converted to N(0,1)   475
11.4-3b     Grade 8 F-values converted to N(0,1)   476
11.4-3c     Grade 11 F-values converted to N(0,1)   477
11.4-4a     Grade 4 SE (ARM) vs. SE (Meanparts)   478
11.4-4b     Grade 8 SE (ARM) vs. SE (Meanparts)   479
11.4-4c     Grade 11 SE (ARM) vs. SE (Meanparts)   480
12(1)       Reporting subgroup variables   482
12(2)       Race/ethnicity classifications   485
12(3)       Determining race/ethnicity   486
12(4)       Geographic regions   488
13.1(1)     Major subgroups for post-stratification   503
13.4(1)     Year 15 almanacs, dates of issue and comments   518
14.1(1)     Reading and writing (ARM) proficiency means for selected groups   530
14.1(2)     Correlations of reading, writing, and selected background variables   534
14.1(3)     Correlations of reading, writing, PSAT scores, and selected background variables   536
14.1(4)     Definition of variables for analyses   537
14.1(5)     Correlations of reading and writing with frequency of reading and writing activities   538
14.1(6)     Item text and response codes for reading and writing   539
14.2(1)     Distributions of design effects for demographic subgroups, Grade 4   551
14.2-1a     Grade 4, log base 10 of design effects   552
14.2-1b     Grade 4, log base 10 of design effects   553


14.2-1c     Grade 4, log base 10 of design effects   554
14.2(2)     Distributions of design effects for demographic subgroups, Grade 8   555
14.2-2a     Grade 8, log base 10 of design effects   556
14.2-2b     Grade 8, log base 10 of design effects   557
14.2-2c     Grade 8, log base 10 of design effects   558
14.2(3)     Distributions of design effects for demographic subgroups, Grade 11   559
14.2-3a     Grade 11, log base 10 of design effects   560
14.2-3b     Grade 11, log base 10 of design effects   561
14.2-3c     Grade 11, log base 10 of design effects   562
15(1)       Number of students by grade/age combination and by type of assessment   571
15(2)       Number of spiral students by grade/age   572
15(3)       Estimated total number of students in the population eligible for spiral assessment, Grade 4/Age 9   575
15(4)       Estimated total number of students in the population eligible for spiral assessment, Grade 8/Age 13   576
15(5)       Estimated total number of students in the population eligible for spiral assessment, Grade 11/Age 17   577
15(6)       Estimated total number of students in the population eligible for spiral assessment who would be deemed unassessable by their schools, Grade 4/Age 9   578
15(7)       Estimated total number of students in the population eligible for spiral assessment who would be deemed unassessable by their schools, Grade 8/Age 13   579
15(8)       Estimated total number of students in the population eligible for spiral assessment who would be deemed unassessable by their schools, Grade 11/Age 17   580
15(9)       Estimated total number of students eligible for assessment by tape sample, Grade 4/Age 9   581
15(10)      Estimated total number of students eligible for assessment by tape sample, Grade 8/Age 13   582
15(11)      Estimated total number of students eligible for assessment by tape sample, Grade 11/Age 17   583
15(12)a     Number of students receiving reading and writing items and plausible values, Grade 4/Age 9   584
15(12)b     Weighted counts of students receiving reading and writing items and plausible values, Grade 4/Age 9   585
15(13)a     Number of students receiving reading and writing items and plausible values, Grade 8/Age 13   586
15(13)b     Weighted counts of students receiving reading and writing items and plausible values, Grade 8/Age 13   587
15(14)a     Number of students receiving reading and writing items and plausible values, Grade 11/Age 17   588


15(14)b     Weighted counts of students receiving reading and writing items and plausible values, Grade 11/Age 17   589
15(15)      General reading proficiency estimates, Grade 4   590
15(16)      General reading proficiency estimates, Grade 8   591
15(17)      General reading proficiency estimates, Grade 11   592
15(18)      ARM writing proficiency estimates, Grade 4   593
15(19)      ARM writing proficiency estimates, Grade 8   594
15(20)      ARM writing proficiency estimates, Grade 11   595
15(21)      Reading proficiency, imputed student grade, Grade 4   596
15(22)      Reading proficiency, student sex, Grade 4   597
15(23)      Reading proficiency, ethnicity/race, Grade 4   598
15(24)      Reading proficiency, region, Grade 4   599
15(25)      Reading proficiency, imputed student age, Grade 4   600
15(26)      Reading proficiency, size/type of community, Grade 4   601
15(27)      Reading proficiency, parental education, Grade 4   602
15(28)      Reading proficiency, % at or above anchor, Grade 4   603
15(29)      Reading proficiency, imputed student grade, Grade 8   604
15(30)      Reading proficiency, student sex, Grade 8   605
15(31)      Reading proficiency, ethnicity/race, Grade 8   606
15(32)      Reading proficiency, region, Grade 8   607
15(33)      Reading proficiency, imputed student age, Grade 8   608
15(34)      Reading proficiency, size/type of community, Grade 8   609
15(35)      Reading proficiency, parental education, Grade 8   610
15(36)      Reading proficiency, % at or above anchor, Grade 8   611
15(37)      Reading proficiency, imputed student grade, Grade 11   612
15(38)      Reading proficiency, student sex, Grade 11   613
15(39)      Reading proficiency, ethnicity/race, Grade 11   614
15(40)      Reading proficiency, region, Grade 11   615
15(41)      Reading proficiency, imputed student age, Grade 11   616
15(42)      Reading proficiency, size/type of community, Grade 11   617
15(43)      Reading proficiency, parental education, Grade 11   618
15(44)      Reading proficiency, % at or above anchor, Grade 11   619
15(45)      Writing proficiency, imputed student grade, Grade 4   620
15(46)      Writing proficiency, student sex, Grade 4   621
15(47)      Writing proficiency, ethnicity/race, Grade 4   622
15(48)      Writing proficiency, region, Grade 4   623
15(49)      Writing proficiency, imputed student age, Grade 4   624
15(50)      Writing proficiency, size/type of community, Grade 4   625
15(51)      Writing proficiency, parental education, Grade 4   626
15(52)      Writing proficiency, % at or above anchor, Grade 4   627
15(53)      Writing proficiency, imputed student grade, Grade 8   628
15(54)      Writing proficiency, student sex, Grade 8   629


15(55)      Writing proficiency, ethnicity/race, Grade 8   630
15(56)      Writing proficiency, region, Grade 8   631
15(57)      Writing proficiency, imputed student age, Grade 8   632
15(58)      Writing proficiency, size/type of community, Grade 8   633
15(59)      Writing proficiency, parental education, Grade 8   634
15(60)      Writing proficiency, % at or above anchor, Grade 8   635
15(61)      Writing proficiency, imputed student grade, Grade 11   636
15(62)      Writing proficiency, student sex, Grade 11   637
15(63)      Writing proficiency, ethnicity/race, Grade 11   638
15(64)      Writing proficiency, region, Grade 11   639
15(65)      Writing proficiency, imputed student age, Grade 11   640
15(66)      Writing proficiency, size/type of community, Grade 11   641
15(67)      Writing proficiency, parental education, Grade 11   642
15(68)      Writing proficiency, % at or above anchor, Grade 11   643


IMPLEMENTING THE NEW DESIGN:
THE NAEP 1983-84 TECHNICAL REPORT


EXECUTIVE  SUMMARY 


In 1982, Educational Testing Service (ETS) proposed to implement a new,
complex design for the National Assessment of Educational Progress (NAEP).
The major features of the design are described in A New Design for a New
Era (Messick, Beaton, & Lord, 1983). ETS received the NAEP grant on July 1,
1983. The purpose of this technical report is to document the
implementation of the design in the 1983-84 national assessment of reading
and writing.

This is a technical report describing assessment processes; it is not
intended to report and interpret the performance of students at various
grade and age levels. Such results are presented in the NAEP reports The
Reading Report Card: Progress Toward Excellence in Our Schools (1985),
Writing: Trends Across the Decade, 1974-84 (Applebee, Langer, & Mullis,
1986a), and The Writing Report Card: Writing Achievement in American
Schools, 1984 (Applebee, Langer, & Mullis, 1986b). This report supports
those reports by providing detailed technical information so that the
adequacy of the substantive results can be judged.

The  introduction  to  this  report  presents  the  major  features  of  the  ETS 
design  and  shows  how  ETS  met  or  exceeded  its  commitments.  Some  major 
features were: to sample grades 4, 8, and 11 as well as 9-, 13-, and
in-school 17-year-olds; to introduce BIB spiralling as a method of estimating
inter-relationships  among  variables;  to  collect  extensive  information  about 
teachers,  principals,  and  schools;  and  to  scale  the  reading  data,  if 
possible.  These  innovations  were  to  be  introduced  in  addition  to  the 
previously used procedures, which were kept to ensure the maintenance of
NAEP  trends. 

The  rest  of  the  technical  report  is  divided  into  three  parts.  Part  I 
describes  the  processes  involved  in  collecting  the  NAEP  data.  The  story 
begins  with  the  development  of  the  NAEP  data  assessment  instruments  for 
reading  and  writing.  A  representative  sample  of  American  students  in  public 
and  private  schools  was  selected  by  Westat,  Inc.,  the  ETS  subcontractor  for 
sampling  and  field  administration.  BIB  spiralling  was  used  to  assign 
assessment  exercises  to  students  to  allow  the  study  of  inter-relationships 
of  assessment  exercises  within  a  subject  area,  such  as  reading  or  writing, 
and  between  the  subject  areas  and  the  background  and  attitude  questions. 
Westat's field administration, from contacting the schools through checking
for completeness of data, is detailed. The processes involved in converting
the responses of students from their assessment booklets to a carefully
edited database are discussed. Finally, the quality control checks are
described. 




Part  II  describes  the  data  analyses.  The  analysis  of  the  reading  data 
is  reported  first.  The  dimensionality  of  the  reading  data  was  explored,  and 
no  reason  was  found  to  reject  the  assumption  of  unidimensionality  for  the 
exercises  used  in  the  reading  scale.    Using  item  response  theory,  the  item 
parameters  of  the  reading  exercises  were  estimated,  and  individual  student 
distributions of plausible values for reading proficiency were created.
Plausible values are a device for encoding both what we know and what we do
not know about an individual's proficiency. A single scale, linking the
three  ages  and  grades,  was  developed;  this  scale  was  linked  to  data 
collected  in  past  assessments,  back  to  the  1970-71  school  year.  The  NAEP 
reading  scale  was  behaviorally  anchored  to  enable  succinct  reporting  of 
what  the  population  of  students  can  and  cannot  do. 

A  NAEP  writing  scale  was  also  produced  by  the  development  of  a  scaling 
procedure called the Average Response Method. This method estimates how
students would have performed if they had been administered a particular
set of ten writing exercises. The NAEP writing scale was applied to the
students in the fourth, eighth, and eleventh grades only. A study of the
effect of changing the way in which NAEP was administered showed
substantial differences in response patterns, and so the data from past
assessments were not merged with the BIB spiralled data. For the writing
trend report, the trends were maintained using the "bridge" data, which
were collected using a tape recorder and reported using average percentages
correct, as in past writing reports.

A  method  for  scaling  background  and  attitude  questions  was  also 
developed,  but  has  not  yet  been  used  extensively. 

The processes involved in estimating the performances of populations of
students are described next in the report. These processes include the
computation  of  sampling  weights,  the  estimation  of  sampling  error  using  the 
jackknife  method,  and  the  estimation  of  variability  due  to  imputation. 
Finally,  the  use  of  the  NAEP's  many  tables  of  parameter  estimates  is 
described. 
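
As a rough illustration of how sampling error and variability due to
imputation might be combined into one error estimate, the following Python
sketch applies the standard multiple-imputation combining rule to a set of
plausible-value estimates. The function name, its inputs, and the choice of
this particular rule are assumptions made for illustration only; the
procedure actually used is the one documented in Part II.

```python
from statistics import mean, variance


def combine_plausible_value_estimates(estimates, sampling_variances):
    """Combine one estimate per plausible value (hypothetical helper).

    estimates          -- the same statistic computed once per plausible value
    sampling_variances -- the sampling variance of each of those estimates
    Returns (point_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need at least two estimates and matching variances")

    point = mean(estimates)            # average over plausible values
    within = mean(sampling_variances)  # average sampling variance
    between = variance(estimates)      # variability due to imputation (divisor m - 1)

    # Standard multiple-imputation rule (assumed here, not quoted from the report).
    total = within + (1.0 + 1.0 / m) * between
    return point, total


if __name__ == "__main__":
    # Made-up numbers, purely to show the call.
    print(combine_plausible_value_estimates(
        [250.0, 252.0, 251.0, 249.0, 253.0], [1.0, 1.0, 1.0, 1.0, 1.0]))
```

The between-value term grows when the plausible values for a group disagree,
which is how uncertainty about individual proficiencies reaches the reported
standard error in this sketch.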

Two  supplementary  studies  are  also  presented:  the  first  is  a  study  of 
the  validity  of  the  NAEP  reading  and  writing  assessment  instruments;  the 
second  is  a  study  of  the  design  effects  in  the  1983-84  sample. 

Part  III  presents  some  estimates  of  the  reading  and  writing 
proficiencies  of  the  sampled  populations  of  students.  Many  thousands  of 
pages  of  such  tables  have  been  developed,  far  too  many  to  include  in  this 
report.  This  part  of  the  report  presents  a  few  selected  tables  which 
estimate  the  proficiencies  of  important  subpopulations,  such  as  the 
different  genders,  racial/ethnic  groupings,  and  regions  of  the  country. 

The  NAEP  1983-84  data,  as  well  as  data  from  all  past  assessments,  are 
available on public-use data tapes. All student, teacher, principal, and
school  data  are  available,  except  for  a  few  items  which  might  compromise 
the  confidentiality  of  the  respondents.  The  plausible  values  for  reading 
and  writing  are  also  available  on  the  tape. 




ACKNOWLEDGMENTS 


The production of this report, as well as the data and data analyses
that it documents, is the result of the contributions of many persons, at
ETS and elsewhere. It is fitting that these individuals be acknowledged and
receive my thanks for their contributions to NAEP.

I  wish  to  thank  the  ETS  management  for  its  support,  especially  Greg 
Anrig,  Bob  Solomon,  and  Henry  Braun,  and  also  the  NAEP  project  management, 
Archie  Lapointe,  Ina  Mullis,  Doug  Rhodes,  and  Paul  Barton. 

I  also  wish  to  thank  the  Assessment  Policy  Committee,  with  Wilmer  Cody 
as  Chair,  which  has  contributed  its  wisdom  and  support. 

I  am  also  grateful  to  Sam  Messick,  who  was  primarily  responsible  for 
the  creation  of  the  new  NAEP  design.  Many  other  persons  on  the  ETS  and 
Westat  staffs  have  made  substantial  contributions.  Fred  Lord  must  be 
singled  out  for  his  major  role. 

Ina  Mullis  has  had  a  major  influence  upon  our  evolving  conception  of 
NAEP.  Her  primary  responsibility  has  been  for  the  development  of  the 
measurement  instruments  and  the  interpretation  of  results. 

Jules  Goodison  was  primarily  responsible  for  the  operational  aspects  of 
the  1983-84  NAEP  and  has  been  a  major  contributor  to  the  implementation  of 
the  design. 

Kent  Ashworth  has  been  particularly  helpful  by  encouraging  the 
publication  of  this  and  other  reports. 

The  Technical  Advisory  Committee,  which  has  maintained  oversight  of  the 
NAEP  design  and  analysis,  has  been  outstanding.  The  members  are  Robert 
Linn,  University  of  Illinois  (chair);  Sylvia  Johnson,  Howard  University 
(vice-chair);  Robert  Glaser,  University  of  Pittsburgh;  Bert  Green,  Johns 
Hopkins  University;  Melvin  Novick,  University  of  Iowa;  Richard  Snow, 
Stanford  University;  and  Ingram  Olkin,  Stanford  University,  who  replaced 
Melvin  Novick  in  1986.    Barbara  Shapiro  served  as  liaison  between  the 
Assessment  Policy  Committee  and  the  Technical  Advisory  Committee. 


*  *  * 


The  implementation  of  the  data  analysis  was  the  function  of  the 
Psychometric  Research  Department,  of  which  I  am  director.  The  staff  members 
made many creative contributions to the statistical and psychometric
aspects  of  NAEP,  as  well  as  managing  the  day-to-day  operation  of  the 
analyses.  I  am  particularly  grateful  for  the  contributions  of  Janet 
Johnson,  Gene  Johnson,  Fred  Lord,  Bob  Mislevy,  Kathy  Sheehan,  Marilyn 
Wingersky,  and  Rebecca  Zwick. 

I  am  also  very  grateful  to  the  Data  Analysis  Department,  under  the 
consummate  direction  of  John  Barone,  for  developing  the  operating  systems 
and  producing  the  data  analyses.  Developing  and  operating  the  NAEP  data 
management  systems  under  severe  time  constraints  were  the  province  of  Norma 
Norris  and    Alfred  Rogers.  Other  members  of  this  group  who  made  substantial 
contributions  of  their  talent  and  work  to  the  data  analyses  are  Laurie 
Barnett,  Jim  Ferris,  Dave  Freund,  Tom  Jirele,  Dick  Harrison,  Bruce  Kaplan, 
Ed  Kulick,  Faustino  Romero,  Ira  Sample,  Bill  Van  Hassel,  and  Minhwei  Wang. 
Although  all  of  these  persons  made  important  contributions,  I  would  be 
remiss not to single out the extraordinary work of Norma Norris, Al Rogers,
Dave Freund, Bruce Kaplan, and Laurie Barnett.

Westat,  Inc.  has  also  contributed  its  wealth  of  wisdom  and  experience 
to  this  report.  I  am  grateful  to  John  Burke,  Nancy  Caldwell,  Morris  Hansen, 
Jocefina  Lago,  Renee  Slobasky,  and  Ren  Tepping. 

Making  a  cohesive  document  out  of  the  many  chapters  in  this  report  was 
the  province  of  our  editor,  Debbie  Kline,  who  has  done  an  outstanding  job. 

Many  other  ETS  staff  members  made  creative  contributions  to  the  design, 
development,  and  implementation  of  NAEP.  I  am  particularly  indebted  to  Joan 
Baratz,  Anne  Campbell,  Kalle  Gerritz,  Peg  Goertz,  Ann  Jungeblut,  and  Gita 
Wilder. 

Without  an  able  secretarial  staff,  this  report  would  have  been 
impossible.  I  wish  to  thank  Sharon  Stewart,  Joyce  Thullen,  and  Marcy  Yates 
for  their  hard  work  and  dedication. 

Thanks  go  also  to  Kirk  Silvester  and  his  staff,  who  were  responsible 
for  the  printing  of  the  final  draft,  and  to  Pete  Stremic  and  his  staff,  who 
were responsible for the cover design and printing of the final report.

I  also  wish  to  express  my  gratitude  to  Professors  Darrell  Bock,  John  B. 
Carroll,  Lyle  Jones,  and  John  Tukey  for  their  important  assistance.  Don 
Searls,  of  the  Education  Commission  of  the  States,  and  Jim  Chromy,  of  the 
Research  Triangle  Institute,  deserve  our  special  thanks  for  their  gracious 
assistance  during  the  transitional  period. 


Albert  E.  Beaton 
Director  of  Data  Analysis 
National Assessment of Educational Progress
February  15,  1987 




IMPLEMENTING  THE  NEW  DESIGN: 
THE  NAEP  1983-84  TECHNICAL  REPORT 

PART  I 




Chapter  1 
INTRODUCTION 


Albert  E.  Beaton 
Educational  Testing  Service 


In  1982,  Educational  Testing  Service  (ETS)  proposed  a  new,  complex 
design  for  the  National  Assessment  of  Educational  Progress  (NAEP).  The 
design  was  described  extensively  in  The  Conduct  of  the  National  Assessment 
of  Educational  Progress,  a  Proposal  in  Response  to  RFP  PA--82-001,  submitted 
by  ETS  to  the  National  Institute  of  Education,  November  17,  1982.  An 
overview  of  the  design  was  published  in  the  report  A  New  Design  for  A  New 
Era  (Messick,  Beaton,  &  Lord,  1983).  Three  years  have  passed  since  ETS 
received  the  grant  to  implement  its  design  for  NAEP;  the  concepts  in  the 
proposed  design  have  now  been  put  into  practice,  the  students  have  been 
assessed,  the  resulting  data  have  been  analyzed,  and  reports  have  been 
published.  The  purpose  of  this  technical  report  is  to  report  on  the 
implementation of the 1983-84 (Year 15) assessment.

Our  aim  is  to  give  the  reader  sufficient  information  to  judge  the 
utility  of  the  design,  the  quality  of  the  NAEP  data,  the  reasonableness  of 
the  assumptions  made,  the  appropriateness  of  the  data  analyses,  and  the 
generality  of  the  inferences  made  from  the  data.  This  report  covers  only 
the  technical  aspects  of  the  Year  15  NAEP.  It  does  not  attempt  to  provide 
the  substantive  results  which  might  be  of  interest  to  educational  policy 
makers;  such  results  are  provided  in  the  reports  The  Reading  Report  Card: 
Progress Toward Excellence in Our Schools (1985), Writing: Trends Across
the Decade, 1974-84 (Applebee, Langer, & Mullis, 1986a), and The Writing
Report Card: Writing Achievement in American Schools, 1984 (Applebee,
Langer,  &  Mullis,  1986b).    The  purpose  of  this  technical  documentation  is 
to  support  those  reports  by  presenting  detailed  information  about  the  data 
and  analyses  that  were  interpreted  and  presented  in  the  reports.  Analyses 
performed  specifically  for  these  substantive  reports  are  discussed  in  the 
procedural  appendix  to  each  report. 

*  *  *

The  National  Assessment  of  Educational  Progress  is  a  continuing, 
congressionally  mandated,  national  survey  of  educational  achievement.  The 
Congressional  Act  (Public  Law  95-561-Nov.  1,  1978)  under  which  the  NAEP 
grant  was  offered  states  that 

[NAEP]... shall have as a primary purpose the assessment of
performance of children and young adults in the basic skills of
reading, mathematics, and communication. Such a National
Assessment  shall... 

(A)  collect  and  report  at  least  once  every  five  years 
data  assessing  the  performance  of  students  at 
various  age  or  grade  levels  in  each  of  the  areas  of 
reading,  writing,  and  mathematics; 

(B)  report periodically data on changes in knowledge and
skills  of  such  students  over  a  period  of  time; 

(C)  conduct  special  assessments  of  other  educational 
areas,  as  the  need  for  additional  information 
arises;  and 

(D)  provide  technical  assistance  to  State  educational 
agencies  and  to  local  educational  agencies  on  the 
use  of  the  National  Assessment  objectives,  primarily 
pertaining  to  the  basic  skills  of  reading, 
mathematics, and communication, and on making
comparisons  of  such  assessments  with  the  national 
profile  and  change  data  developed  by  the  National 
Assessment. 

NAEP  continues  to  fulfill  the  Congressional  mandate  and  also  gathers 
ancillary  data  which  can  be  of  use  in  interpreting  the  basic  findings  about 
the  knowledge  and  skills  of  young  Americans.  It  is  the  first  ongoing  effort 
to  obtain  comprehensive  and  dependable  achievement  data  on  a  national  basis 
in  a  uniform,  scientific  manner. 

NAEP  was  originally  designed  and  mandated  by  law  in  the  1960s,  and 
collected  its  first  data  in  1969.    The  NAEP  grant  was  administered  by  the 
Education  Commission  of  the  States  (ECS)  until  1983  when  the  grant  was 
moved  to  ETS.  Since  its  inception,  NAEP  has  collected  information  not  only 
on reading, writing, and mathematics, as required by law, but also on a
number  of  other  subject  areas  such  as  science,  citizenship,  art,  and  music. 
The  1983-84  (Year  15)  assessment,  described  in  this  technical  report, 
covered  reading  and  writing  as  well  as  numerous  background  and  attitude 
questions. 

Before  presenting  the  achievements  of  the  Year  15  assessment,  it  is 
important  to  recall  the  fourteen  prior  years  in  which  a  National  Assessment 
existed.  During  those  years,  the  vision  of  Ralph  Tyler,  Frank  Keppel,  and 
many others was realized. As Messick, Beaton, and Lord (1983) asserted, the
National Assessment design "...was brilliantly responsive to the political
constraints  of  the  time"  (p.  1).    The  design  was  also  brilliantly 
responsive  to  the  technical  constraints  of  the  time  and  has  been  shown  to 
have  been  far  ahead  of  its  time;  the  vision  of  Professor  John  Tukey,  of 
Princeton  University  and  the  first  Chairman  of  the  NAEP  Analysis  Advisory 
Committee  (ANAC),  and  many  others,  was  indeed  realized.  Studying  this 
design,  and  working  to  modify  it,  has  brought  to  the  ETS  staff  an  even 
greater  appreciation  of  the  elegance  of  the  original  design  of  NAEP. 




The  NAEP  design  until  1983  included  the  selection  of  nationally 
representative  samples  of  students  who  were  9,  13,  and  17  years  old,  and 
the  young  adult  population.  Budget  limitations  forced  the  end  of  regular 
assessments of the adult population and the out-of-school 17-year-olds in
the mid to late 1970s. However, these populations are assessed
periodically. For efficient use of staff, the 13-year-olds were assessed
in the fall of the school year, the 9-year-olds in the winter, and the 17-
year-olds  in  the  spring.    Assessment  exercises  were  assigned  to  students 
using  multiple  matrix  sampling;  different  packages  of  exercises  were 
assigned  to  students  in  different  assessment  sessions,  but  the  same  package 
was  assigned  to  all  students  in  a  particular  session.    The  assessments  were 
administered  using  a  tape  recorder  to  minimize  the  effect  of  a  student's 
reading  ability  on,  say,  his  or  her  mathematics  performance.  NAEP  was 
designed to report the achievement of students in the United States as a
whole, and in subpopulations such as groups based on regions of the
country,  ethnicity,  and  gender. 

The  ECS  design  for  NAEP  and  its  modification  by  the  ETS  design  are  both 
intended  to  report  to  the  interested  public  what  students  can  and  cannot  do 
but  differ  substantially  as  to  how  to  achieve  that  purpose.    Lord  (1962), 
who  coined  the  term  "matrix  sampling",  addressed  the  problem  of  estimating 
the  proportion  of  a  population  of  persons  who  could  respond  correctly  to  a 
population  of  items,  given  a  fixed  number  of  item  responses.    He  showed 
that  a  sample  with  many  persons  responding  to  just  one  item  resulted  in  an 
estimator  with  a  smaller  standard  error  than  one  derived  from  a  sample  in 
which  fewer  persons  responded  to  many  items.  Of  course,  such  sampling  would 
not  ordinarily  be  cost-effective,  since  selecting  individuals  is  expensive, 
and  it  is  usually  possible  to  assess  a  number  of  exercises  fairly 
inexpensively  from  the  individuals  who  are  sampled.  The  ECS  conception  of 
NAEP  was  interested  in  estimating  the  proportion  of  students  who  could  pass 
particular  exercises  and  the  proportion  who  could  pass  certain  pre- 
specified  populations  of  exercises.    Consequently,  ECS'  design  for  NAEP 
used  a  cost-effective  compromise,  multiple  matrix  sampling,  which 
administered  packages  containing  about  45  minutes  of  exercises  to  the 
students  in  its  sample.    This  application  of  matrix  sampling,  however, 
meant  that  correlations  could  not  be  computed  between  exercises  in 
different  packages,  although  they  could  be  computed  between  exercises 
within  the  same  package.    Since  they  were  superfluous  to  the  ECS  approach, 
the  inter-exercise  correlations  were  seldom,  if  ever,  used  for  interpreting 
NAEP  results. 
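
The trade-off Lord described can be sketched numerically. The Python
fragment below is a minimal illustration under a deliberately simplified
model that is not taken from the report: each person is assumed to have a
personal probability of answering a randomly chosen item correctly, these
probabilities have mean mu and variance sigma2 across persons, and
responses are independent given the person. The function name and all
numeric values are hypothetical.

```python
import math


def se_proportion_correct(n_persons, items_per_person, mu, sigma2):
    """Standard error of the overall proportion correct (simplified model).

    mu     -- mean probability of a correct response across persons
    sigma2 -- variance of that probability across persons
    """
    if not 0.0 <= sigma2 <= mu * (1.0 - mu):
        raise ValueError("sigma2 must lie between 0 and mu * (1 - mu)")

    # Between-person part shrinks only as more persons are sampled.
    between = sigma2 / n_persons
    # Within-person (binomial) part shrinks with the total number of responses.
    within = (mu * (1.0 - mu) - sigma2) / (n_persons * items_per_person)
    return math.sqrt(between + within)


if __name__ == "__main__":
    total_responses = 1200  # fixed budget of item responses
    for k in (1, 10, 40):
        n = total_responses // k
        print(k, n, round(se_proportion_correct(n, k, 0.6, 0.04), 4))
```

With the number of item responses held fixed, the standard error in this
sketch is smallest when every response comes from a different person, which
is the direction of Lord's result. The sketch ignores the cost of recruiting
each person, which is the reason the compromise described above was adopted.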

The ETS conception of NAEP is heavily dependent on knowledge of the
inter-exercise  correlations  for  expanded  interpretation  of  the  data.  In 
simplest  terms,  the  main  idea  is  that  if  the  items  could  be  placed  in  such 
an  order  that  a  person's  answering  an  item  correctly  at  a  particular 
difficulty  level  implied  that  he  or  she  could  answer  all  easier  items, 
knowing  the  most  difficult  exercise  a  student  could  answer  correctly  would 
imply  what  that  student  could  and  could  not  do  for  the  entire  population  of 
exercises.  Of  course,  few,  if  any,  sets  of  real  items  are  so  rigidly 
ordered,  and  such  ordering  is  clearly  impossible  where  guessing  is  allowed. 
However, other, less demanding, item response theory (IRT) models are
available  to  be  applied  when  the  data  are  approximately  unidimensional. 
Although  the  ECS  design  was  sufficient  to  order  exercises  by  difficulty 
defined  in  terms  of  percent  passing  an  item,  the  inability  to  estimate 
inter-exercise  correlations  made  it  impossible  to  examine  whether  the 
persons  who  passed  the  more  difficult  exercises  tended  in  fact  to  be  those 
who  passed  the  easier  exercises,  and  not  otherwise.  BIB  spiralling,  a 
complex  variant  of  multiple  matrix  sampling,  was  a  feature  of  the  ETS 
design  which  facilitated  the  collection  of  inter-exercise  data  in  such  a 
way that dimensionality could be explored.
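
The rigid ordering described above can be made concrete with a small check.
The Python sketch below counts, for a complete matrix of right/wrong
responses, how often a student passes a harder exercise while failing an
easier one. It is an illustration only: the function name is hypothetical,
difficulty is taken to be the number of students passing each exercise, and
the check is not the dimensionality analysis used in the report. Note that
it needs the same students to have answered both exercises in a pair, which
is exactly the inter-exercise information the earlier design lacked across
packages.

```python
def count_ordering_violations(responses):
    """Count violations of a perfect easy-to-hard ordering.

    responses -- list of rows, one per student, each a list of 0/1 scores
                 with one column per exercise (no missing data assumed).
    """
    n_items = len(responses[0])

    # Rank exercises from easiest to hardest by the number of students passing.
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    easiest_first = sorted(range(n_items), key=lambda j: -totals[j])

    violations = 0
    for row in responses:
        ordered = [row[j] for j in easiest_first]
        for easy in range(n_items):
            for hard in range(easy + 1, n_items):
                # Failing the easier exercise but passing the harder one
                # contradicts the rigid ordering.
                if ordered[easy] == 0 and ordered[hard] == 1:
                    violations += 1
    return violations


if __name__ == "__main__":
    perfect = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(count_ordering_violations(perfect))
```

A count of zero corresponds to the idealized case in the text; real data,
especially where guessing is possible, would be expected to show violations.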

If  the  dimensionality  study  showed  that  the  exercises  fell 
approximately  on  a  single  dimension,  a  single  scale  could  summarize  most  of 
the  information  about  student  performance  quite  adequately.  If  the 
exercises  fell  on  more  than  one  dimension,  a  scale  for  each  dimension  would 
have  to  be  developed,  if  sufficient  data  were  available  to  support  the 
scaling  process;  otherwise,  other  summarization  procedures,  such  as  the 
average percentages used by ECS, could be used. The 1983-84 NAEP showed
that a majority of the reading exercises could be adequately fit to a
unidimensional model and so these reading exercises were scaled. Using the
ordering  of  the  exercises,  the  reading  scale  was  behaviorally  anchored  so 
that  points  on  the  scale  could  be  interpreted  as  levels  of  proficiency, 
describing  what  students  at  those  levels  could  and  could  not  do.  The 
writing  exercises  were  scaled  using  an  alternative  method  which  did  not 
require  the  assumption  of  unidimensionality. 

The implementation of the ETS design for NAEP was not simple; reaching
the  new  design  goals  has  required  some  improvisation  and  the  development  of 
new  techniques.  Although  ETS  staff  was  able  to  begin  operations  about  three 
months  earlier,  the  NAEP  grant  period  began  on  July  1,  1983  and  the 
assessment  of  students  began  in  September  of  the  same  year.    First,  the 
operational details of the old design were assimilated and merged with
those of the new design. Next, the reading, writing, background, and
attitude questions were reviewed and reorganized. Over 200 assessment
booklets, and additional questionnaires, were printed. The cooperation of
the schools was enlisted. The students were assessed and their data
returned to ETS. All data were key-entered, scored, and checked. Then, the
data analysis began. During this time, there was continual stress between
the competing goals of producing reports at the earliest possible moment
and  having  the  most  carefully  and  elegantly  constructed  analysis  possible. 
Completing  a  project  of  this  magnitude  and  complexity  required  the 
dedication  of  many  experts  on  the  staffs  of  ETS  and  Westat,  Inc.  (the  ETS 
subcontractor  for  sampling  and  field  administration)  as  well  as  the  careful 
coordination  of  their  ideas  and  work. 

The  NAEP  staff,  of  course,  did  not  do  this  work  alone.  It  had  the 
policy  guidance  of  the  Assessment  Policy  Committee  (APC),  chaired  by  Wilmer 
Cody.    It  is  also  important  to  recognize  the  many  thoughtful  reviews, 
suggestions,  comments,  and  other  substantial  help  on  technical  issues  that 
the  NAEP  staff  received  from  the  highly  accomplished  members  of  its 
Technical  Advisory  Committee  (TAC),  chaired  by  Professor  Robert  Linn,  of 
the  University  of  Illinois.    Other  members  of  the  original  ETS/NAEP  TAC 
were Professor Robert Glaser of the University of Pittsburgh, Professor
Bert  Green  of  Johns  Hopkins  University,  Professor  Sylvia  Johnson  of  Howard 
University, Professor Melvin Novick of the University of Iowa, and Professor
Richard  Snow  of  Stanford  University.    Professor  Ingram  Olkin  of  Stanford 
University  has  since  replaced  Professor  Novick  as  a  TAC  member.    The  ETS 
staff  also  received  important  help  during  the  transitional  period  from  Don 
Searls  and  other  members  of  the  ECS  staff  and  from  James  Chromy  and  others 
on  the  staff  of  the  Research  Triangle  Institute  (RTI). 

Although this report covers all technical aspects of the Year 15 NAEP,
it  may  be  useful  to  summarize  here  the  major  innovative  features  of  the 
Year  15  NAEP  and  to  compare  the  features  promised  in  the  ETS  proposal  with 
the  actuality  of  the  Year  15  assessment. 

•  ETS proposed to modify the RTI sampling plan, and did. Westat,
Inc. modified the sampling plan to

(1)  sample students in grades 4, 8, and 11 as well as
ages  9,  13,  and  17; 

(2)  collect  data  about  students  whose  reading  and 
writing  proficiencies  could  not  be  assessed  because 
of  physical  or  other  handicap  and  who  were  excluded 
from  the  regular  assessment  sample;  and 

(3)  provide  randomly  equivalent  national  samples  for 
comparing  the  administration  procedures  of  the 
former  ECS  and  new  ETS  designs. 

•  ETS proposed to introduce BIB spiralling, a complex method of
assigning  assessment  exercises  to  students,  and  did.  The 
purpose  of  BIB  spiralling  is  to  enhance  the  ability  to 
estimate  inter-exercise  relationships.  ETS  proposed  to  spiral 
only  the  reading  exercises  but  went  further  by  spiralling 
together  reading,  writing,  background,  and  attitude  questions. 

•  ETS proposed to collect information on the teachers,
principals,  and  schools  of  the  sampled  students,  and  did. 

•  ETS proposed to collect two equivalent student samples in
order  to  measure  the  effect  of  changing  from  administration  by 
tape  recorder  to  pencil  and  paper,  and  did. 

•  ETS  proposed  to  examine  the  dimensionality  of  its  data  in 
order  to  judge  the  appropriateness  of  scaling,  and  did. 

•  ETS  proposed  to  scale  the  reading  data,  if  appropriate,  and 
did.  The  scaling  procedure  outlined  in  the  ETS  proposal  was 
used,  but  the  data  were  found  to  be  too  sparse  at  the  level  of 
individual  respondents  for  this  type  of  analysis.  ETS  then 
developed  and  applied  other  scaling  and  analytic  procedures 
which  produced  satisfactory  results. 




•  ETS proposed to form a single reading scale over all three
grade/age levels, and did. The resulting NAEP reading scale
spans  Grade  4/Age  9  to  Grade  11/Age  17  and  was  also  used  to 
analyze  the  data  collected  by  ECS  in  1970-71,  1974-75,  and 
1979-80. 

•  ETS proposed to improve the interpretability of the reading
data by behaviorally anchoring various scale points, and did.
A new procedure was developed to show what students at various
scale  points  could  and  could  not  do. 

•  ETS did not propose to scale the writing data, but did. A new
method of scaling and analysis was developed for the writing
data. The writing scale was applied to all three grade/age
levels assessed in 1983-84 but not to ECS' previous data since
the change from a tape-recorded to a pencil-and-paper
administration procedure seemed to affect writing responses
substantially.    As  anticipated  in  the  ETS  proposal,  the 
writing  trend  report  was  produced  using  the  same  procedures  as 
in  the  past  whereas  the  writing  cross-sectional  report  was 
produced  using  the  new  writing  scale. 

•  ETS proposed to form scales of background and attitude
questions,  and  has  done  so  to  a  small  degree.  A  general 
purpose  method  for  such  scaling  has  been  developed  and  applied 
to  some  writing  background  data,  but  the  properties  of  the  new 
method  have  not  yet  been  fully  explored. 

•  ETS proposed to run complex multivariate analyses of the NAEP
data, but has not yet done so to the extent envisioned.
Appropriate methods for such analyses with the NAEP data are
under study. We expect more development in this area in the
future. 

Although  the  size  of  the  grant  was  fixed  and  the  actual  reporting  of 
results was not unusually slow compared to past NAEPs and other comparable
surveys,  the  ETS  design  resulted  in  some  substantial  extra  costs  and 
unexpected  time  delays.    One  example  of  extra  cost  is  that  of  collecting 
the  inter-exercise  information  through  BIB  spiralling.    This  method 
required printing and managing over 200 different assessment booklets;
about  24  booklets  would  have  sufficed  for  the  ECS  design.  The  unexpected 
time  delays  resulted  largely  from  what  was  essentially  the  research  nature 
of this first application of the ETS design to reading and writing; when
empirical  results  did  not  support  the  proposed  analysis  procedures,  we 
developed  and/or  applied  procedures  which  were  more  appropriate. 
Presumably,  this  research  aspect  of  the  work  will  be  greatly  reduced  in 
future  assessments  of  reading  and  writing. 

*  *  * 

The  Year  15  NAEP  staff  was  greatly  concerned  not  only  with  the  accuracy 
of its results but with making its public-use data tapes available in a
format  which  would  be  as  easy  for  others  to  use  as  possible.  The  purpose  of 
the  public-use  data  tapes  is  to  allow  others  to  check  our  analyses,  to 
perform  alternate  analyses  using  different  methods,  and  to  perform  analyses 
for other purposes. The public-use data tapes are already available for the
Year  15  data,  as  they  are  for  all  previous  assessments,  and  contain  all 
student,  teacher,  and  school  data  that  were  collected,  except  that 
information  whose  availability  would  risk  the  confidentiality  of  the 
subjects.    The  public-use  data  tapes  are  formatted  for  and  have  parameter 
statements  for  the  commonly  used  statistical  systems  SPSS  and  SAS. 

The  dual  goals  of  accuracy  and  ease  of  use  have  affected  the 
construction  of  the  database.    Several  points  are  worth  noting. 

It  is  impossible  to  make  a  database  as  complex  as  NAEP's  completely 
simple  to  use.    A  secondary  user  cannot  use  the  database  effectively 
without  some  knowledge  of  the  NAEP  design.    For  example,  sampling  by  grade 
and  age  forces  the  user  to  consider  which  subsample  is  appropriate  for  a 
particular  analysis.    BIB  spiralling  results  in  a  substantial  amount  of 
data  which  is  missing  by  design  (about  90  percent!);  thus,  the  user  must 
think  carefully  about  missing  data  procedures.    Although  we  have  tried  to 
make the public-use data tapes as easy to use as possible, their use will
require  some  investment  in  understanding  NAEP. 
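
To make "missing by design" concrete, the following Python sketch builds a
toy balanced incomplete block layout and deals the resulting booklets out to
students in rotation. Everything specific here is an assumption chosen for
brevity: seven blocks of exercises, three blocks per booklet, and a cyclic
construction from the base booklet (0, 1, 3). The actual Year 15 layout was
far larger and is described elsewhere in this report; the function names are
hypothetical.

```python
from collections import Counter
from itertools import combinations


def cyclic_bib_booklets(n_blocks=7, base=(0, 1, 3)):
    """Toy design: shift a base booklet around a cycle of n_blocks blocks."""
    return [tuple(sorted((b + shift) % n_blocks for b in base))
            for shift in range(n_blocks)]


def pair_counts(booklets):
    """How many booklets contain each pair of blocks."""
    counts = Counter()
    for booklet in booklets:
        for pair in combinations(booklet, 2):
            counts[pair] += 1
    return counts


def spiral(booklets, n_students):
    """Deal booklets to students in rotation within a session."""
    return [booklets[i % len(booklets)] for i in range(n_students)]


def missing_fraction(booklets, n_blocks):
    """Share of the student-by-block table left empty by the design."""
    seen = sum(len(b) for b in booklets)
    return 1.0 - seen / (len(booklets) * n_blocks)


if __name__ == "__main__":
    booklets = cyclic_bib_booklets()
    print(booklets)
    print(sorted(pair_counts(booklets).items()))
    print(round(missing_fraction(booklets, 7), 3))
```

In this toy layout every pair of blocks appears together in exactly one
booklet, so every inter-exercise relationship can be estimated from some
subset of students, yet each student still sees fewer than half of the
blocks. An analysis that silently drops students with any missing response
would therefore discard everyone, which is why the choice of missing data
procedure matters.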

Two  features  of  the  tapes  give  the  user  additional  analytic  power. 
Most  complex  surveys  require  sampling  weights  to  achieve  proper  population 
estimates,  and  the  weights  are  supplied  for  use  in  analysis.    This  has  been 
done  for  NAEP.    However,  with  a  complex  sampling  design,  the  weighted 
versions  of  standard  formulas  for  independent  and  identically  distributed 
variables  are  not  appropriate  for  estimating  sampling  errors;  while 
appropriate  formulas  can  be  developed,  they  are  complex  to  apply.  Some 
other  method  based  on  pseudo-replicates,  such  as  the  jackknife,  is 
appropriate  and  simple  in  application.    We  have  developed  and  applied  one 
form  of  the  jackknife  method,  which  we  used  in  all  NAEP  analyses.  It 
requires  32  sampling  weights  for  each  student  in  addition  to  the  sampling 
weight  usually  supplied.    All  of  these  weights  are  available  on  the 
public-use  data  tapes  in  a  way  that  makes  possible  the  approximate 
estimation  of  sampling  error  using  standard  statistical  systems  as  opposed 
to  specialized  software  designed  for  survey  data.    Since  this  ability  comes 
with  the  cost  of  more  computing  time,  the  secondary  user  may  use  this  new 
ability  or  not,  as  he  or  she  deems  appropriate. 
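
As a rough illustration of how replicate weights can be used with a general-purpose system, the sketch below computes a weighted mean with the full-sample weight, recomputes it under each set of replicate weights, and sums the squared deviations. The sum-of-squared-deviations form, the function names, and the toy data (4 replicate weight sets rather than the 32 on the tapes) are assumptions made for illustration, not the procedure documented elsewhere in this report.

```python
from math import sqrt
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of `values` under one set of sampling weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jackknife_standard_error(
    values: Sequence[float],
    full_weights: Sequence[float],
    replicate_weights: Sequence[Sequence[float]],
) -> float:
    """Approximate standard error of a weighted mean.

    Assumed form: the variance is the sum, over replicate weight sets, of the
    squared difference between the replicate estimate and the full-sample
    estimate.
    """
    full_estimate = weighted_mean(values, full_weights)
    variance = sum(
        (weighted_mean(values, rep) - full_estimate) ** 2
        for rep in replicate_weights
    )
    return sqrt(variance)


# Hypothetical toy data: 4 students (1 = correct, 0 = incorrect).
scores = [1.0, 0.0, 1.0, 1.0]
base = [1.0, 1.0, 1.0, 1.0]
replicates = [
    [2.0, 0.0, 1.0, 1.0],
    [0.0, 2.0, 1.0, 1.0],
    [1.0, 1.0, 2.0, 0.0],
    [1.0, 1.0, 0.0, 2.0],
]
estimate = weighted_mean(scores, base)
standard_error = jackknife_standard_error(scores, base, replicates)
```

The extra computing time mentioned above is visible here: the statistic is recomputed once for every replicate weight set.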

The  other  feature  of  the  public-use  data  tapes  is  that  they  exceed  the 
standard  practice  of  providing  only  raw  data  by  also  providing  derived 
variables  for  reading  and  writing.    The  complexity  of  the  IRT  scaling 
analysis  prompted  this  inclusion.    The  underlying  rationale  follows. 

The  item-sampling  designs  that  have  characterized  NAEP  since  its 
inception  provided  efficient  estimates  for  average  levels  of  performance  in 
groups  of  students,  but  are  too  sparse  to  yield  accurate  estimates  for 
individual  students.  Until  now,  NAEP  reported  only  estimates  of  the 
proportions  of  students  who  could  answer  individual  items  or  sets  of  items 
correctly, avoided estimating student proficiency distributions, and did
not  make  individual  proficiency  measures  available  to  the  secondary  user. 
The  lack  of  individual  proficiency  measurements  encumbered  analyses  of  the 
relationships  between  proficiency  and  student  characteristics.  Regrettably, 
it  is  common  in  educational  surveys  to  carry  out  these  latter  analyses  with 
poorly  estimated  scores  for  individuals,  despite  the  demonstrable 
invalidity of their results (see Goldstein, 1980).

Recent  developments  in  item  response  theory,  in  statistical  estimation 
procedures,  and  in  methodologies  for  handling  missing  data  make  it  possible 
for  the  first  time  to  estimate  accurately  student  proficiency  distributions 
and  their  relationships  with  background  variables  from  complex,  sparse 
sampling  designs.  The  embodiment  of  these  advances,  the  derived  variables 
called  "plausible  values"  for  reading  and  writing,  were  constructed  to 
yield  consistent  estimates  of  such  population  characteristics  for  the  NAEP 
populations  as  a  whole,  and  for  the  subpopulations  defined  by  the 
traditional  NAEP  reporting  categories.  The  intricacies  and  expense  involved 
in  obtaining  optimal  estimates  from  such  a  complex  database  may  prove 
prohibitive to most secondary analysts, however, and the plausible values
mentioned  above  are  therefore  provided  for  exploratory  analyses  involving 
other  background  variables  as  well.  Chapters  10  and  11  provide  details  on 
the  construction  and  properties  of  plausible  values  and  caveats  on  their 
use. 
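
A minimal sketch of one way a secondary analyst might work with several plausible values per student: compute the statistic of interest once with each plausible value, then average the results. The record layout, the number of plausible values per student, and the simple averaging rule are assumptions for illustration only; Chapters 10 and 11 remain the authority on proper use.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of `values` under one set of sampling weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def mean_over_plausible_values(
    plausible_values: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> float:
    """Average a weighted mean over the sets of plausible values.

    plausible_values[i][k] is assumed to hold the k-th plausible value
    for student i.
    """
    n_draws = len(plausible_values[0])
    per_draw = [
        weighted_mean([student[k] for student in plausible_values], weights)
        for k in range(n_draws)
    ]
    return sum(per_draw) / n_draws


# Hypothetical toy data: 3 students, 2 plausible values each.
toy_values = [[200.0, 210.0], [250.0, 240.0], [300.0, 310.0]]
toy_weights = [1.0, 1.0, 2.0]
toy_mean = mean_over_plausible_values(toy_values, toy_weights)
```

For a mean, averaging within student first gives the same answer, but for statistics such as variances or percentiles the order of operations matters, which is one reason to compute the statistic separately for each plausible value.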


*  *  * 


This  technical  report  presents  the  details  of  how  the  assessment  was 
accomplished, from the development of the exercises through the analyses of
the  data.  The  report  is  organized  into  three  parts: 

*  Part  I  explains  the  steps  in  the  process  of  developing  the 
basic  data.    Part  I  begins  with  an  overview,  followed  by 
chapters  covering  the  development  of  the  reading  and  writing 
exercises;  the  sampling;  the  assignment  of  exercises  to 
students;  a  summary  of  the  instruments;  the  field 
administration  (including  attainment  of  school  cooperation); 
and  the  data  entry,  exercise  scoring,  and  construction  of  the 
NAEP  database  and  public-use  data  tapes.    Quality  control  is 
covered throughout Part I.

*  Part  II  explains  the  steps  involved  in  data  analysis.  This 
part  also  begins  with  an  overview.    The  next  chapters  include 
discussions  of  the  scaling  and  analysis  of  the  reading 
exercises,  the  writing  exercises,  and  the  background  and 
attitude  questions;  weighting  and  parameter  estimation, 
including  the  estimation  of  uncertainty  due  to  sampling  and 
measurement  error;  and  the  validity  of  the  NAEP  data.  The 
final  chapter  of  Part  II  discusses  the  use  of  the  standard 
tabulations  of  NAEP  results. 


*  Part III presents some estimates of the proficiency of the
students  in  American  schools.    First,  estimates  of  the  numbers 
of  students  at  ages  9,  13  and  17  and  at  grades  4,  8  and  11  are 
given,  as  well  as  estimates  for  different  genders, 
racial/ethnic  groups  and  other  subpopulations.  Then, 
estimates  of  the  various  points  in  the  distributions  of 
reading  and  writing  proficiency  are  presented.  Finally, 
estimates  of  average  values  on  the  reading  and  writing  scales 
are  given  for  a  number  of  cross-classifications  of  students. 


The  organizational  strategy  for  this  report  is  to  first  present 
overviews  of  the  two  components  of  NAEP,  design  and  analysis.  These 
overviews  direct  the  reader  to  chapters  where  details  are  provided.  Each 
chapter  begins  with  a  summary,  then  presents  a  detailed  exposition  of  its 
topic.    In  some  cases,  chapters  refer  to  appendices  or  supplementary 
documents  which  contain  even  more  detail.    This  strategy  has  been  adopted 
to  aid  the  reader  in  reaching  areas  of  special  interest.    The  reader  who 
wishes  only  a  summary  may  read  just  the  overviews  (Chapters  2  and  9). 

We  have  intended  to  include  in  this  report  all  of  the  avenues  we  have 
pursued,  whether  successfully  or  not,  and  have  succeeded  to  some  degree  in 
doing  so.    This  approach  has  been  adopted  to  help  readers  understand  the 
rationale  for  what  was  finally  done  and  to  prevent  them  from  entering  the 
same  blind  alleys  in  the  future.    Where  detailed  descriptions  of  unfruitful 
avenues  are  available,  they  have  been  included;  where  the  wrong  paths  would 
be  unduly  expensive  to  document,  they  have  been  alluded  to.    We  have  also 
included  some  comments  on  what  we  would  do  differently  if  we  could  begin 
the  design,  data  collection,  or  analysis  again. 

The  chapters  are  separately  authored  and  differ  somewhat  in  style  and 
point  of  view.    In  most  cases,  the  person  most  responsible  for  the  activity 
was  assigned  the  writing  task.    We  hope  that  the  chapters  can  be  read 
independently, after the appropriate introductions are read. Although we
have  tried  to  cross-reference  where  necessary,  the  method  of  organization 
results  in  some  redundancy  from  chapter  to  chapter. 




Chapter  2 


OVERVIEW OF PART I:
THE DESIGN AND IMPLEMENTATION
OF YEAR 15 (1983-84) NAEP


Albert E. Beaton
Educational  Testing  Service 


This  introduction  to  Part  I  of  the  technical  report  provides  an 
overview  of  the  processes  by  which  the  NAEP  Year  15  data  evolved  from  the 
planning  stage  into  a  database  ready  for  analysis.  The  major  components  of 
this  NAEP,  with  few  details,  are  presented  here  with  pointers  to  the 
succeeding  chapters  which  contain  more  information.  Although  the  remaining 
chapters  in  this  part  of  the  Technical  Report  contain  most  of  the  important 
details  about  each  topic,  some  of  the  chapters  themselves  direct  the  reader 
to  even  greater  detail  to  be  found  in  appendices  and  supplementary 
documents.  The  organization  of  the  report  is  intended  to  help  an  interested 
reader  locate  the  areas  of  greatest  interest  to  him  or  her,  then  study 
those  areas  in  as  much  depth  as  necessary  to  understand  the  procedures  and 
considerations  involved  in  the  collection  of  data.    From  this  report,  it  is 
expected  that  the  reader  will  be  able  to  judge,  for  himself  or  herself,  the 
quality of the data, their strengths and their weaknesses.

This  chapter,  and  this  part  of  the  technical  report,  does  not  include  a 
discussion  of  the  procedures  used  in  data  analysis;  the  methods  of  data 
analysis  are  summarized  in  the  introduction  to  Part  II  of  this  report,  and 
then  discussed  in  detail  in  succeeding  chapters.    Also,  the  chapter  does 
not  include  the  substantive  results  of  the  NAEP;  those  results  are 
published  separately  in  reports  such  as  The  Reading  Report  Card:  Progress 
Toward Excellence in Our Schools (1985).

Section  2.1  of  this  introduction  provides  a  brief  summary  of  the  design 
of  the  Year  15  NAEP,  focusing  on  differences  between  the  new  model  for  NAEP 
and  the  model  which  preceded  it.    The  exposition  of  the  design  is  brief; 
the  ETS  design  is  covered  extensively  in  another  report,  A  New  Design  for  a 
New  Era  (Messick,  Beaton,  &  Lord,  1983). 

To provide background, Section 2.2 presents the NAEP assessment
schedule from the first year of data collection in 1969 to the Year 15
assessment. The assessments in progress or planned through 1987-88 are also
mentioned.

The author wishes to thank Bruce Kaplan, Ira Sample, and Laurie
Barnett, who produced the tables used in this chapter.

Sections  2.3  through  2.8  follow  the  sequence  of  the  remaining  chapters 
in  Part  I  of  the  technical  report: 


*  the development of the reading and writing exercises and the
processes by which they were reviewed (Chapter 3);

*  the four-stage stratified random sampling procedure used in
the NAEP (Chapter 4);

*  the assignment of the NAEP cognitive and other exercises to
students selected for the sample (Chapter 5);

*  a description of the instruments and an overview of the items
(Chapter 6);

*  the field administration procedures, including the training of
the field administrators, attaining school cooperation,
assessment administration, and quality control (Chapter 7);
and

*  the flow of data from their receipt at ETS through data entry,
professional scoring, and entry into the database in final
form, ready for analysis (Chapter 8).


In addition, Section 2.9 presents a statistical summary of the data
that  were  collected  in  Year  15. 

The  data  collected  in  the  Year  15  NAEP  are  now  ready  for  public  use  in 
the  form  of  a  set  of  public-use  data  tapes,  documented  in  the  NAEP  1983-84 
Public-Use Data Tapes Version 3.1 Users' Guide (Barone, Norris, & Rogers,
1986).    These  tapes  contain  the  available  data  for  the  sampled  students, 
their  teachers,  principals,  and  schools. 


2.1    The  Design  of  the  Year  15  Assessment 

To  understand  the  design  of  the  Year  15  assessment,  it  is  first  helpful 
to  review  the  previous  design  employed  by  the  Education  Commission  of  the 
States  (ECS).    As  noted  in  A  New  Design  for  a  New  Era,  the  ECS  design  was 
brilliantly  responsive  to  the  demands  of  its  times,  and  the  ECS  staff  and 
consultants  deserve  substantial  praise  for  the  elegance  and  efficiency  of 
that  design.    Because  of  possible  variations  in  the  definition  of  "grade" 
in  different  school  systems,  the  ECS  design  called  for  sampling  ages 
instead  of  grades.    One  of  NAEP's  aims  was  to  measure  performance  over  a 
broad  range  of  exercises,  while  requiring  not  more  than  about  45  minutes  of 
a  student's  time;  thus,  matrix  sampling  was  used.    To  avoid  the  possible 
confounding of achievement in areas such as mathematics with the ability to
read the questions and directions, tape recorders were used to present
instructions  and  exercises.    The  intended  result  of  an  assessment  was  an 
estimated  percent  of  students  who  could  perform  successfully  on  each 
exercise.    The  estimated  percents  would  also  be  presented  separately  for 
different  geographic  regions,  genders,  races,  community  types,  and  other 
subgroups.    The  ECS  staff  quickly  became  aware  that  the  users  of  their 
reports  wanted  a  summarization  of  the  massive  amount  of  exercise-by- 
exercise  information  and  thus  moved  to  reporting,  additionally,  the  mean 
percentages  correct  over  logically  homogeneous  subsets  of  exercises  and, 
ultimately,  over  all  exercises  within  a  subject  area. 

Nevertheless,  the  ETS  staff  felt  that  the  original  design  could  be 
modified  to  make  NAEP  results  easier  to  understand  and  use,  and  proposed  a 
major  re-design  of  NAEP.    ETS  decided  to  gather  samples  by  both  age  and 
grade  because  sampling  only  by  age  made  the  assessment  results  not  directly 
relevant  to  school  policies,  which  are  usually  established  by  grade  level. 
Additionally,  the  tape  recorders  set  the  assessment  apart  from  all  other 
testing  programs,  so  the  national  data  from  other  testing  programs  could 
not  be  used  for  comparative  purposes  without  administering  those  exercises 
using  a  tape  recorder.  The  tape  recorder  had  also  resulted  in  the 
requirement that all students at an assessment session respond to the same
exercises  at  the  same  moment,  thus  creating  a  less  efficient  sampling 
design.    ETS  proposed  administration  by  printed  instructions  which  would 
allow  it  to  "spiral"  different  tests  into  an  assessment  session.  ETS  also 
introduced  scaling  to  enhance  the  comparability  of  results  over  different 
assessment  forms  and  with  an  evolving  exercise  pool. 

ETS  was  sensitive  to  the  great  wealth  incorporated  in  the  data  that  had 
been  collected  during  the  previous  fourteen  years;  data  had  been  collected 
on  over  a  million  students  in  eleven  subject  areas.    Any  radical  change 
which  in  effect  made  the  old  data  unusable  would  not  be  acceptable;  thus, 
ETS  proposed  to  run  parallel  assessments  in  each  subject  area,  one 
assessment  using  the  past  tape  procedures  and  the  other  using  the  new 
printed  instructions.  The  samples  for  these  two  assessments  were 
equivalent;  thus,  differences  between  the  two  methods  could  be  attributed 
to  administration  differences  and  sampling  error. 


2.2    Assessment  Schedule 

The  schedule  of  assessments  up  to  Year  15  is  shown  in  Table  2(1).  As 
this  table  illustrates,  the  subject  areas  assessed  have  included  reading, 
writing,  mathematics,  science,  and  social  studies,  as  well  as  citizenship, 
literature,  art,  music,  and  career  development.    Assessments  were  conducted 
annually  through  1980  and  have  been  conducted  biennially  since  then.  Many 
subject  areas  have  been  re-assessed  periodically  to  determine  trends  in 
achievement  over  time.    Since  its  inception,    NAEP  has  assessed 
9-year-olds,  13-year-olds,  and  in-school  17-year-olds.  The  assessment  of 
out-of-school  17-year-olds  and  young  adults  was  dropped  because  of  budget 
restrictions.  To  date,  NAEP  has  assessed  approximately  1,300,000  young 
Americans. 




Table 2(1)

NAEP Learning Areas, Grades, and Ages Assessed: 1969-1984

GRADES/AGES ASSESSED* columns: Grade 4, Age 9, Grade 8, Age 13, Grade 11,
Age 17IS, Age 17OS, Age ADULT

ASSESSMENT YEAR    LEARNING AREAS                         NUMBER OF GRADE/AGE
                                                          COLUMNS MARKED X

Year 1/1969-70     Science                                5
                   Writing                                5
                   Citizenship                            5

Year 2/1970-71     Reading                                5
                   Literature                             5

Year 3/1971-72     Music                                  5
                   Social Studies                         5

Year 4/1972-73     Science (2)                            5
                   Mathematics                            5

Year 5/1973-74     Career and Occupational Development    5
                   Writing (2)                            4

Year 6/1974-75     Reading (2)                            4
                   Art                                    4

Year 7/1975-76     Citizenship/Social Studies (2)         4
                   Mathematics**                          3

Year 8/1976-77     Science (3)                            3
                   Basic Life Skills**                    1
                   Health**                               1
                   Energy**                               1
                   Reading** (3)                          1
                   Science** (3)                          1

Year 9/1977-78     Mathematics (2)                        3
                   Consumer Skills**                      1

Year 10/1978-79    Art (2)                                3
                   Music (2)                              3
                   Writing (3)                            3

Year 11/1979-80    Reading (4)                            4
                   Literature (2)                         4

Year 12/1980-81    No Data Collection

Year 13/1981-82    Mathematics (3)                        3
                   Citizenship/Social Studies (3)         3
                   Science** (4)                          3

Year 14/1982-83    No Data Collection

Year 15/1983-84    Reading (5)                            6
                   Writing (4)                            6

*    17IS denotes 17-year-olds enrolled in public or private schools; 17OS denotes
     17-year-olds who dropped out of school or graduated prior to the assessment.
**   Small, special-interest assessments conducted on limited samples at specific ages
( )  Second and subsequent assessments of a learning area




1983-84 was a transition year for NAEP. The Education Commission of the
States (ECS) had the role of deciding the subject areas to be measured,
reading and writing, and developing the exercises and background and
attitude items to be administered. The Research Triangle Institute (RTI),
the sampling and field administration subcontractor for ECS, had the role
of selecting the sample of schools. The Educational Testing Service (ETS)
prepared the assessment booklets according to its design, prepared the data
for analysis, and analyzed the data. Westat, Inc., the sampling and field
administration subcontractor for ETS, modified and extended the sample and
administered the assessment to the sampled students.

Table 2(1) also indicates the initiation in Year 15 of data collection
by grade as well as by age.

Assessments  through  1988  are  either  in  progress  or  in  the  planning 
stage.  In  Year  16  (1984-85),  a  separately  funded  assessment  of  the  literacy 
of young adults was administered, the results of which have been published
in Literacy: Profiles of America's Young Adults, Final Report (Kirsch &
Jungeblut, 1986). This survey also collected a small sample of out-of-school
17-year-olds. The Year 17 (1985-86) assessment includes reading,
mathematics,  science,  and  computer  competence,  with  a  special  probe  of  U.S. 
history  and  literature  for  the  older  students.    Current  plans  call  for  the 
assessment  of  reading,  writing,  citizenship,  and  U.S.  history  in  Year  19 
(1987-88). 


2.3    The Development of NAEP Measurement Instruments

The  Year  15  NAEP  assessed  the  performance  of  American  students  in  the 
learning  areas  of  reading  and  writing.    In  addition,  a  large  number  of 
background and attitude questions were asked. Information was also
collected  from  the  students'  principals  and  teachers. 

The  development  of  the  reading  exercises,  writing  exercises,  and 
background  and  attitude  items  was  the  responsibility  of  ECS.    ECS  gave  to 
ETS a large number of exercises, more than could be used, and ETS selected
the items that were actually administered. The details of the development
of the exercises are provided in Chapter 3.

From  its  inception,  NAEP  has  developed  assessments  through  a  consensus 
process.    Educators,  scholars,  and  citizens  representative  of  many  diverse 
constituencies  and  points  of  view  design  objectives  for  each  subject  area 
assessment,  proposing  general  goals  they  feel  students  should  achieve  in 
the  course  of  their  education.    After  careful  reviews,  the  objectives  are 
given to item writers, who develop assessment questions appropriate to the
objectives. 

All  exercises  undergo  extensive  reviews  by  subject-matter  and 
measurement  specialists,  as  well  as  careful  scrutiny  to  eliminate  any 
potential  bias  or  lack  of  sensitivity  to  particular  groups.  Some  of  the 
questions used in each assessment are made available to anyone interested
in  studying  or  using  them.    The  remainder  have  traditionally  been  kept 
secure  for  use  in  future  assessments  for  the  examination  of  trends  over 
time. 

The  reading  assessment  contained  multiple  choice,  short  open-ended,  and 
essay  exercises.  The  reading  essay  exercises  were  professionally  scored. 
All  writing  exercises  required  that  the  student  write  an  essay,  and  these 
essays  were  also  professionally  scored.    The  professional  scoring  is 
described  in  Chapter  8.2. 

In  recent  assessments,  NAEP  has  asked  numerous  background  and  attitude 
questions  to  improve  the  usefulness  of  NAEP  achievement  results  and  provide 
the  opportunity  to  examine  policy  issues.    Students,  teachers  and  school 
officials  answer  a  variety  of  questions  about  instruction,  activities, 
experiences,  curricula,  resources,  attitudes  and  demographics. 

2.4    The  NAEP  Sample 

The  sampled  populations  consisted  of  all  in-school  students  who  were 
9,  13,  or  17  years  old  or  who  were  in  the  4th,  8th,  or  11th  grades  in  the 
50  states  and  the  District  of  Columbia.  Both  public  and  private  school 
students  were  sampled. 

The  sample  is  a  four-stage  probability  sample.    The  stratified  sample 
of  first-stage  units  and  of  schools  within  selected  first-stage  units  was 
developed  and  selected  by  the  Research  Triangle  Institute,  using  sample 
sizes  specified  by  Westat,  Inc.    The  third  and  fourth  stages  of  sampling, 
involving the assignment of sessions to schools and the selection of
students, were designed and implemented by Westat, Inc. The Year 15 sample
design is described in detail in Chapter 4 and in Westat's Report on Sample
Selection, Weighting, and Variance Estimation: NAEP - Year 15 (Lago, Burke,
Tepping, & Hansen, 1985). The four stages are summarized below.

Stage  1:    Primary  Sampling  Units 

In  the  first  stage  of  sampling,  the  United  States  was  divided 
into  geographical  units  comprised  of  counties  or  groups  of 
contiguous  counties,  which  met  a  minimum  school  enrollment  size. 
These  units,  called  primary  sampling  units  (PSUs),  were  classified 
into  20  strata  which  were  defined  by  region  (Northeast,  Southeast, 
Central,  and  West)  and  by  the  sample  description  of  community  (Big 
City,  Fringe  of  Big  City,  Medium  City,  Small  Place,  and  Extreme 
Rural).  A  sample  of  64  PSUs  was  then  selected  (without 
replacement)  to  represent  all  regions  and  sizes  of  communities 
with  probability  proportional  to  population  size  measures. 




Stage  2:    Sampling  Schools 


In  the  second  stage  of  sampling,  the  frame  consisted  of  a  file 
of schools obtained from Quality Education Data, Inc. (QED). The
file included public, private, Catholic, Bureau of Indian Affairs,
and Department of Defense schools, listed according to the three
grade/age groups within each of the 64 PSUs. The NAEP grade/age
groups were Grade 4/Age 9, Grade 8/Age 13, and Grade 11/Age 17.

To  allow  sampling  of  extreme-low  socio-economic  status  (SES) 
big-city  schools  at  a  double  sampling  rate,  schools  within  big- 
city  PSUs  were  stratified  by  SES  and  their  estimated  sizes  were 
doubled.    Extreme-rural  schools  were  also  oversampled  by  a  factor 
of  two. 

Schools  within  each  PSU  were  selected  (without  replacement) 
with  probabilities  proportional  to  assigned  measures  of  size. 
Roughly  equal  measures  of  size  were  assigned  to  schools  containing 
estimates  of  age-eligible  students  ranging  from  20  to  160  (for  age 
9), or 20 to 200 (for ages 13 and 17). Schools above the
indicated maximum size were selected with probabilities
proportional to the number of age-eligible students. Schools with
fewer than 20 estimated age eligibles were assigned considerably
lower  measures  of  size,  since  they  had  higher  per-student 
administrative  costs. 
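
The measure-of-size rule just described can be sketched as a small function. The size bounds (20, 160, 200) come from the text; the constant value used inside the range and the reduction factor for small schools are hypothetical, since the text says only "roughly equal" and "considerably lower".

```python
def measure_of_size(estimated_eligibles: int, age: int) -> float:
    """Hypothetical measure of size for a school at a given age class."""
    max_size = 160 if age == 9 else 200  # upper bounds stated in the text
    min_size = 20                        # lower bound stated in the text
    small_school_factor = 0.25           # assumed; not given in the text

    if estimated_eligibles > max_size:
        # Large schools: proportional to the number of age-eligible students.
        return float(estimated_eligibles)
    if estimated_eligibles >= min_size:
        # Mid-sized schools: one common value (assumed equal to the bound).
        return float(max_size)
    # Small schools: considerably lower measure of size.
    return small_school_factor * max_size


mid_sized = measure_of_size(100, age=9)
large = measure_of_size(300, age=13)
small = measure_of_size(10, age=17)
```

Setting the common value equal to the upper bound keeps the measure continuous where the proportional rule takes over; that is a design choice of this sketch, not a documented detail.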


Stage  3:  Assignment  of  Sessions  to  Schools,  by  Type 

The  assignment  of  sessions  to  schools  served  as  the  third 
stage  of  sampling.    This  assignment  was  done  separately  by  the  two 
types  of  sessions,  designated  "spiral"  and  "tape",  which  represent 
separate  samples  of  the  population  of  students.    The  Year  15  tape 
sample  contains  students  of  specified  ages  (who  could  be  of  any 
grade).    The  Year  15  spiral  sample  contains  the  students  who 
received  either  BIB  or  UBIB  booklets  (see  below)  and  represents 
two  overlapping  samples.    The  first  sample  represents  students  of 
specified  ages  (who  could  be  in  any  grade)  and  is  comparable  to 
the  samples  from  previous  NAEP  assessments.    It  is  also  randomly 
equivalent  to  the  samples  of  students  who  were  administered  the 
tape  booklets  in  the  Year  15  assessment.    The  second  sample 
represents  students  of  specified  grades  (who  could  be  of  any  age). 

For tape assessments there were four distinct booklets at
each  age  class,  each  of  which  was  to  be  administered  once  within 
each  of  the  PSUs.    To  assure  that  no  tape  session  would  include  a 
very  small  number  of  students,  small  schools  were  clustered  with 
other  schools  in  the  same  PSU  so  as  to  form  clusters  with  an 
estimated  minimum  of  eight  eligibles.    Tape  sessions  were  then 
assigned  within  each  PSU  by  ordering  schools  or  school  clusters  by 
socioeconomic  status  and  size  and  then  selecting  a  systematic 
sample of four schools at each age with probability proportional
to the estimated number of age eligibles within the school (or
school  cluster). 
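
A systematic sample with probability proportional to size can be sketched as follows: the ordered units are laid end to end on a line whose length is the total size, and selection points are placed at a fixed interval from a random start. The function name and the toy sizes are hypothetical, and the handling of units larger than the sampling interval (which this simple version could select more than once) is not addressed here.

```python
import random
from typing import List, Optional, Sequence


def systematic_pps_sample(
    sizes: Sequence[float],
    n_select: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return indices of units chosen by systematic PPS sampling.

    `sizes` must already be in the desired order (for example, sorted by
    socioeconomic status and size).
    """
    rng = rng or random.Random()
    interval = sum(sizes) / n_select
    next_point = rng.random() * interval  # random start in [0, interval)
    selected: List[int] = []
    cumulative = 0.0
    for index, size in enumerate(sizes):
        cumulative += size
        # Select this unit once for each selection point falling inside it.
        while next_point < cumulative and len(selected) < n_select:
            selected.append(index)
            next_point += interval
    return selected


# Hypothetical estimated numbers of age eligibles for 7 schools or clusters.
eligibles = [10, 30, 20, 40, 50, 25, 25]
picked = systematic_pps_sample(eligibles, 4, random.Random(0))
```

Because the list is ordered before selection, the fixed interval spreads the selected schools across the ordering variables.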

One  spiral  session  was  assigned  to  each  school  or  school 
cluster which was not selected for a tape session. The balance of
the  spiral  sessions  were  then  assigned  to  schools  (and  school 
clusters)  at  a  rate  approximately  proportional  to  the  estimated 
number  of  eligible  students,  by  age  or  grade,  that  would  be 
available  after  the  initial  assignment  of  tape  and  spiral 
sessions. 


Stage  4:  Sampling  Students 

In  the  fourth  stage  of  sampling,  a  consolidated  list  of  all 
grade-  and  age-eligible  students  was  established  for  each  selected 
school.    A  systematic  selection  of  eligible  students  was  made  and 
students  were  assigned  to  spiral  or  tape  sessions,  depending  on 
whether  the  assessment  was  to  be  administered  by  pencil  and  paper 
or  by  tape  recorder. 
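
An equal-probability systematic selection from a consolidated list can be sketched as below. The roster format and function name are hypothetical, and the rule that then assigns each selected student to a spiral or tape session is not modeled.

```python
import random
from typing import List, Optional, Sequence


def systematic_selection(
    roster: Sequence[str],
    n_select: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Pick `n_select` entries at a fixed interval from a random start."""
    if not 0 < n_select <= len(roster):
        raise ValueError("n_select must be between 1 and the roster length")
    rng = rng or random.Random()
    interval = len(roster) / n_select
    start = rng.random() * interval  # random start in [0, interval)
    last = len(roster) - 1
    return [
        roster[min(int(start + k * interval), last)]
        for k in range(n_select)
    ]


# Hypothetical consolidated list of eligible students in one school.
roster = [f"student_{i}" for i in range(10)]
chosen = systematic_selection(roster, 5, random.Random(1))
```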

Some  students  were  deemed  unassessable  by  the  school 
authorities  because  they  did  not  speak  English,  were  judged  as 
being  educable  mentally  retarded,  or  were  functionally  disabled. 
In  these  cases,  a  questionnaire  was  filled  out  by  the  school  staff 
listing  the  reason  for  excluding  the  student  and  providing  some 
background  information. 


Sampling  Principals  and  Teachers 

A  Principal  Questionnaire,  distributed  to  each  sampled  school 
by  Westat  prior  to  the  assessment,  was  used  by  Westat  to  obtain 
both  an  up-to-date  estimate  of  grade/age-eligible  students  and 
information  on  minority  enrollment. 

The School Characteristics and Policy Questionnaire and
Teacher  Questionnaire  were  distributed  in  every  sampled  school. 
The  School  Characteristics  and  Policy  Questionnaire  was  mailed  to 
the  school  by  Westat  prior  to  the  assessment  and  picked  up  by  the 
Westat  supervisor,  then  returned  to  ETS. 

The  Teacher  Questionnaire  was  administered  to  the  teachers  of 
a  subsample  of  the  students  sampled  for  spiral  sessions.  The 
purpose  of  this  sample  was  to  estimate  the  number  (proportion)  of 
students  whose  teachers  had  various  attributes,  not  the  attributes 
of  the  teacher  population.    Therefore,  statements  like  "20  percent 
of  students  have  teachers  who  have..."  are  appropriate  in 
discussing  Teacher  Questionnaire  data,  but  statements  like  "20 
percent  of  teachers  have..."  are  not. 
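
The distinction can be made concrete with a small sketch. Each record is a sampled student carrying a sampling weight and a flag for the teacher attribute; the record layout and the toy numbers are hypothetical.

```python
from typing import Sequence, Tuple

# (student sampling weight, teacher identifier, teacher has the attribute)
Record = Tuple[float, str, bool]


def percent_of_students_with_attribute(records: Sequence[Record]) -> float:
    """Weighted percent of students whose teacher has the attribute."""
    total = sum(weight for weight, _, _ in records)
    with_attribute = sum(weight for weight, _, has in records if has)
    return 100.0 * with_attribute / total


def naive_percent_of_teachers(records: Sequence[Record]) -> float:
    """Unweighted percent of distinct teachers with the attribute.

    Shown only for contrast: the sample was not designed to represent
    the teacher population, so this figure should not be reported.
    """
    teachers = {teacher_id: has for _, teacher_id, has in records}
    return 100.0 * sum(teachers.values()) / len(teachers)


# Three sampled students share teacher T1; one student has teacher T2.
toy_records = [
    (1.0, "T1", True),
    (1.0, "T1", True),
    (1.0, "T1", True),
    (1.0, "T2", False),
]
student_percent = percent_of_students_with_attribute(toy_records)
teacher_percent = naive_percent_of_teachers(toy_records)
```

In this toy case 75 percent of students have a teacher with the attribute, while only half of the distinct teachers do; only the first statement is supported by the design.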




The  number  of  teachers  sampled  was  equal  to  the  number  of 
spiral  sessions  conducted  in  the  school.  Thus,  if  there  were  six 
spiral  sessions  conducted  in  a  school,  a  subsample  of  six  students 
was  selected  and  the  school  coordinator  was  asked  to  identify  the 
English  or  Language  Arts  instructor  for  each  student.  These 
instructors  completed  the  Teacher  Questionnaires.  Please  note 
that,  since  a  number  of  students  may  have  had  the  same  teacher, 
and  some  teachers  did  not  complete  the  questionnaire,  the  number 
of  students  in  the  subsample  for  whom  teacher  information  is 
available  is  greater  than  the  number  of  teachers  who  completed 
questionnaires  in  a  given  school. 


2.5    Assigning  NAEP  Exercises  to  Students 

After the student sample was selected, booklets of exercises had to be
assigned to the sampled students. The ETS design for NAEP greatly affected
the way in which the assessment booklets were organized and constructed. The
assignment of booklets depended on whether the student was in the spiral or
the tape sample. A detailed discussion of this topic can be found in
Chapter  5. 


2.5.1    Spiral  Booklets 

The  spiral  sample  is  so  called  because  the  assessment  booklets  were 
spiralled  within  an  assessment  session,  that  is,  different  booklets  were 
interleaved  so  that  different  students  in  the  same  assessment  session  were 
asked  to  respond  to  different  exercises.  With  spiralling,  the  instructions 
to  the  students  and  the  exercises  themselves  must  be  read  by  the  student 
from  his  or  her  booklet  since  administration  using  a  tape  recorder  would  be 
unmanageable  with  more  than  one  type  of  booklet  in  an  assessment  session. 
The  purpose  of  spiralling  was  to  increase  sampling  efficiency. 

The  targeted  sample  size  was  for  2,000  students  to  respond  to  each 
exercise  at  each  age  or  grade  level  in  the  spiral  sample;  this  target 
implied  a  sample  of  2,600  at  each  grade/age. 

The  reading  and  writing  exercises  were  sorted  into  units  called  blocks 
which  were  designed  to  take  a  student  fourteen  minutes  to  complete.  The 
fourteen  minutes  included,  on  the  average,  twelve  minutes  of  either  reading 
or  writing  exercises  and  two  minutes  of  background  and  attitude  questions. 
Altogether,  there  were  21  such  blocks  of  exercises  created  for  each 
grade/age  level.  Three  double-length  (28-minute)  blocks  were  also 
developed,  making  24  blocks  per  grade/age  combination.  Some  blocks  were 
administered  at  more  than  one  age  and  grade. 

The spiral sample can be divided into two parts: the BIB and UBIB
samples.

The BIB (Balanced Incomplete Block) sample was created to meet the
design goal of facilitating the estimation of inter-correlations or other
statistics among the assessment exercises. Using a BIB design, a large
number of booklets were created in such a way that each pair of exercises
was administered to a randomly equivalent subsample of students while
maintaining the goal of 2,000 students for each exercise at both age and
grade levels. For the BIB part of the spiral sample, 57 assessment booklets
were assembled for each grade/age level. Each booklet began with one
six-minute block containing only background questions, followed by three
fourteen-minute blocks containing a combination of cognitive exercises and
background and attitude questions.
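
The balance property described above (every pair of blocks appearing together in the same number of booklets) can be checked mechanically. The sketch below uses a toy design of 7 blocks in 7 booklets of 3 blocks each; it is not the NAEP block-to-booklet assignment, and the function names are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def pair_counts(booklets):
    """Count how many booklets contain each unordered pair of blocks."""
    counts = Counter()
    for booklet in booklets:
        for pair in combinations(sorted(booklet), 2):
            counts[pair] += 1
    return counts


def is_balanced(booklets, n_blocks):
    """True if every block is used equally often and every pair of blocks
    appears together in the same (nonzero) number of booklets."""
    block_counts = Counter(b for booklet in booklets for b in booklet)
    pairs = pair_counts(booklets)
    all_pairs = combinations(range(1, n_blocks + 1), 2)
    pair_values = {pairs.get(p, 0) for p in all_pairs}
    block_values = {block_counts.get(b, 0) for b in range(1, n_blocks + 1)}
    return (len(pair_values) == 1 and 0 not in pair_values
            and len(block_values) == 1)


# Toy balanced incomplete block design (NOT the NAEP design):
# 7 blocks, 7 booklets, 3 blocks per booklet, each pair together once.
TOY_BOOKLETS = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
                (2, 5, 7), (3, 4, 7), (3, 5, 6)]

print(is_balanced(TOY_BOOKLETS, 7))        # True
print(is_balanced(TOY_BOOKLETS[:-1], 7))   # False: some pairs never meet
```

Because every pair of blocks is seen together by some group of students, statistics that involve two exercises at once, such as correlations, can be estimated for any pair.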

The  UBIB  (Unbalanced  Incomplete  Block)  part  of  the  spiral  sample  was 
developed  to  accommodate  several  long  exercises  which  could  not  fit  into 
the 14-minute blocks, thus necessitating the development of three
double-length blocks. Because of these three blocks, six more booklets,
called UBIB booklets, were created using an unbalanced design. The UBIB booklets
began  with  the  same  common  background  section,  which  was  followed  by  one 
double-length  and  one  single-length  block.    Since  it  was  not  possible  to 
pair  the  double-length  blocks  with  each  other  within  the  available  student 
time,  some  of  the  single-length  blocks  used  in  the  UBIB  booklets  were  also 
used  in  the  BIB  booklets  and  thus  are  available  for  selected 
inter-correlations. 

The  booklets  developed  using  the  BIB  and  UBIB  designs  were  interleaved 
into the spiral sample in bundles of 23 in a randomly selected order. Each
of the 57 BIB and 6 UBIB booklets was placed in the bundles in such a way
that  the  estimated  number  of  students  receiving  each  block  was  at  least 
2,600  per  grade/age.  Each  booklet  had  the  appropriate  probability  of  being 
at  each  position  within  a  bundle. 

The  bundles  were  distributed  in  assessment  sessions  within  a  school  so 
that,  in  almost  all  instances,  no  two  students  in  a  session  were  given  the 
same  booklet. 
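
The idea of spiralling can be sketched in a few lines. This is a deliberately simplified illustration: it deals booklets in one repeating, randomly ordered cycle and does not model the bundles of 23 or the position probabilities described above. The booklet labels and the function name are hypothetical.

```python
import random


def spiral_assign(session_sizes, booklet_ids, seed=0):
    """Deal booklets to students in a repeating, randomly ordered cycle.

    Consecutive students receive different booklets, so a session with no
    more students than there are booklets never repeats a booklet.
    """
    rng = random.Random(seed)
    order = list(booklet_ids)
    rng.shuffle(order)
    assignments = []
    position = 0
    for size in session_sizes:
        session = []
        for _ in range(size):
            session.append(order[position % len(order)])
            position += 1
        assignments.append(session)
    return assignments


# Hypothetical labels for the 57 BIB and 6 UBIB booklets.
booklets = ([f"BIB-{i:02d}" for i in range(1, 58)]
            + [f"UBIB-{i}" for i in range(1, 7)])
sessions = spiral_assign([25, 30, 20], booklets)
for session in sessions:
    assert len(set(session)) == len(session)  # no repeats within a session
```

Carrying the position over from one session to the next keeps the number of students per booklet nearly equal across the whole sample.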


2.5.2    Tape  Booklets 

Another  design  feature  of  the  new  NAEP  was  the  collection  of  bridge 
samples  by  administering  some  of  the  NAEP  items  with  tape  recordings  and 
age-only  simple  matrix  sampling  as  had  been  done  in  past  assessments.  The 
purpose  of  these  samples  was  to  explore  the  effect  of  the  change  from 
tape-recorded administration to pencil-and-paper administration and, if
possible,  to  project  the  results  of  past  assessments  onto  the  new  scale 
(see  Chapters  10  and  11).    Using  some  of  the  items  in  the  BIB  and  UBIB 
booklets,  four  tape  booklets  were  administered  to  age-eligible  students  of 
each  grade/age. 

Thus,  another  four  booklets  were  printed  for  each  grade/age  level  for 
administration  using  NAEP's  former  procedures.  These  booklets  were 
administered  in  separate  sessions  using  a  tape  recorder  for  directions.  All 
students  in  a  given  tape  session  were  administered  the  same  booklet.  Each 
of the four tape booklets contained two sections: a section of common
background items and a section of cognitive reading and writing exercises
(three booklets contained both reading and writing exercises in the
cognitive section; one contained only reading exercises). In most cases,
the cognitive exercises were also used in past assessments and in the
spiral booklets.


2.5.3  Timing 

The Grade 8/Age 13 students were assessed in the fall of 1983, the
Grade 4/Age 9 students were assessed in the winter of 1984, and the Grade
11/Age 17 students were assessed in the spring of 1984.

A testing session lasted approximately one hour. The BIB and UBIB
booklets took 48 minutes of actual testing time; tape booklets took
approximately 53 minutes.

See  Chapter  5  for  details. 
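
The stated testing time of 48 minutes for the spiral booklets follows from the block lengths given in Section 2.5.1. The short sketch below only restates that arithmetic; the constant names are illustrative.

```python
# Block lengths in minutes, as described in Section 2.5.1.
COMMON_BACKGROUND = 6   # common background block at the start of each booklet
SINGLE_BLOCK = 14       # regular block
DOUBLE_BLOCK = 28       # double-length block

# BIB booklet: common background block plus three regular blocks.
bib_minutes = COMMON_BACKGROUND + 3 * SINGLE_BLOCK

# UBIB booklet: common background block, one double-length block,
# and one regular block.
ubib_minutes = COMMON_BACKGROUND + DOUBLE_BLOCK + SINGLE_BLOCK

print(bib_minutes, ubib_minutes)  # 48 48
```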


2.6    Instrument and Item Information

The  assessment  incorporated  four  distinct  types  of  instruments: 
student  assessment  booklets;  a  questionnaire  for  excluded  students;  a 
teacher  questionnaire;  and  a  school  characteristics  and  policies 
questionnaire. 

The student assessment booklets were composed of items that were either
cognitive or non-cognitive. Cognitive items were reading exercises, study
skill exercises or writing exercises. Non-cognitive items asked questions
relative to the backgrounds and attitudes of students. Some non-cognitive
items were presented to every student and were placed together in a block
called the common block or common core. Others were placed at the
beginning of the blocks containing the cognitive items.

The  reading  items  included  short  and  long  reading  passages,  graphically 
presented  materials,  poems,  and  reference  materials  (e.g.,  tables  of 
contents). Some items required a multiple-choice response, some open-ended
items required a brief written response, and some required written essays.
A total of 176 reading items was presented to Grade 4/Age 9; a total of 192
reading items was presented to Grade 8/Age 13; and a total of 196 reading
items was presented to Grade 11/Age 17.

The writing items were developed to assess performance in three writing
areas: informative, persuasive and imaginative. Students were asked to
write, for example, letters, descriptive essays, or narrative pieces.
From a total pool of 22 writing items, 15 were used at each grade/age.

Each booklet included six minutes of background and attitude items
common to all students. These items are general questions concerning
materials in the home, parental education, etc. Each block also contained
additional background and attitude items, related to objectives formulated
for reading and writing. The items measured students' perceptions of their
teachers' instructional practices in reading and writing; their own study
habits and reading activities; their perceptions of the value of reading
and writing; and their assessment of themselves as readers and writers.

The  Excluded  Student  Questionnaire  was  developed  and  used  for  the  first 
time  in  the  Year  15  assessment.    It  was  designed  to  gather  more  information 
about  particular  conditions  for  exclusion  and  characteristics  of  the 
learning experience of excluded students.

The  Teacher  Questionnaire  was  also  developed  and  used  for  the  first 
time  in  Year  15.    It  was  designed  to  gather  information  on  the  curricula 
and teaching methods used by selected English and Language Arts teachers.

The  School  Characteristics  and  Policy  Questionnaire  was  distributed  to 
each  participating  school  to  be  completed  by  either  the  school's  principal 
or  another  person  familiar  with  data  concerning  enrollment,  facilities, 
curricula  and  staff  development. 

More  information  about  the  items  and  instruments  can  be  found  in 
Chapter  6. 


2.7    Field  Administration 

Westat  was  responsible  for  field  administration.  The  process  began  with 
the  development  of  necessary  materials  and  a  field  organization.  Materials 
were  developed  for  training,  contacting  the  schools,  sampling,  and  process 
control.  The  field  organization  consisted  of  district  supervisors  and 
exercise  administrators.    Westat  trained  the  district  supervisors,  who  in 
turn  trained  the  exercise  administrators. 

Gaining  school  cooperation  was  a  joint  effort  of  Westat  and  ETS.  ETS 
first  contacted  the  Chief  State  School  Officers  (CSSOs),  informing  them 
that  schools  within  their  states  had  been  selected  for  NAEP.  Later, 
mailings  and  materials  were  sent  to  the  CSSOs,  school  district 
superintendents  and  private  school  officials.    Meeting  arrangements  were 
then  established  by  telephone  and  contact  forms  were  filed  with  Westat. 
Westat  district  supervisors  then  scheduled  and  conducted  introductory 
meetings. 

Westat  administered  the  assessment  in  the  field  primarily  through  the 
work  of  district  supervisors.    District  supervisors  had  many 
responsibilities, including drawing the sample of students, completing
assessment  reporting  forms,  making  final  arrangements  for  the  assessments, 
supervising  exercise  administrators,  distributing  and  collecting  other  data 
forms  and  questionnaires,  and  editing,  boxing  and  shipping  assessment 
materials. 

Both Westat and ETS were responsible for quality control. There were
two specifically designed quality control studies of the field effort. The
first, and most intensive, involved on-site visits by Westat and ETS staff
to verify the sampling and to observe the supervisors and exercise
administrators as they conducted assessments. The second study was a
telephone survey of a ten-percent sample of schools. This survey took
place after the field period had ended and all assessment activities had
been completed in the schools.

Field  administration  is  discussed  in  detail  in  Chapter  7. 


2.8    Database  Construction 

Westat  shipped  the  assessment  booklets  to  ETS  for  entry  into  computer 
files,  checking,  and  forming  the  database.  Careful  checking  assured  that 
all data from the field were received. The data then went through extensive
processing,  described  in  Chapter  8. 

Since  machine  readable  assessment  booklets  were  not  used,  an 
"intelligent"  data  entry  system  was  developed.  This  computer  program  not 
only  received  the  input  data  but  also  checked  for  consistency  among  the 
many  different  booklets,  blocks,  and  formats.  The  program  assured  that  all 
entered  values  of  each  variable  were  within  the  range  of  possible  values. 
All  data  were  independently  key-verified  and  all  discrepancies  were 
resolved. 
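
The two checks described above, range checking and independent key verification, can be sketched as follows. This is only an illustration of the general technique, not the ETS data entry program; the variable names, ranges, and function names are hypothetical.

```python
def out_of_range(record, valid_ranges):
    """Return the variables whose entered value is outside its allowed range."""
    return [name for name, value in record.items()
            if name in valid_ranges
            and not (valid_ranges[name][0] <= value <= valid_ranges[name][1])]


def key_discrepancies(first_pass, second_pass):
    """Return the variables on which two independent keyings disagree."""
    names = first_pass.keys() | second_pass.keys()
    return sorted(n for n in names if first_pass.get(n) != second_pass.get(n))


# Hypothetical variables and ranges, for illustration only.
valid_ranges = {"AGE": (8, 18), "ITEM01": (1, 5)}

print(out_of_range({"AGE": 9, "ITEM01": 7}, valid_ranges))   # ['ITEM01']
print(key_discrepancies({"AGE": 9, "ITEM01": 3},
                        {"AGE": 9, "ITEM01": 4}))            # ['ITEM01']
```

Any variable flagged by either function would be sent back to the original booklet for resolution.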

Student  responses  to  some  of  the  reading  exercises  and  all  of  the 
writing  exercises  had  to  be  professionally  scored.  Professional  scorers 
were  hired,  trained,  and  closely  supervised.    Exercises  were  scored  by  both 
holistic  and  primary  trait  methods,  as  well  as  some  secondary  trait  and 
mechanics  scoring  methods.    Random  samples  of  essays  were  independently  re- 
scored  and  reliability  coefficients  were  estimated. 

Extensive  quality  control  checks  were  instituted  to  assure 
correspondence  between  what  had  been  written  in  the  booklet  and  what 
appeared  in  the  database.  A  random  sample  of  each  assessment  booklet  and 
questionnaire  was  selected  from  the  computer  file  and  checked  against  the 
original document. The database was determined to be extraordinarily
error-free.

The construction of the database and public-use data tapes is
described in more detail in Chapter 8.


2.9    Tabular  Summary  of  NAEP  Year  15  Sample 

The  purpose  of  this  section  is  to  present  the  characteristics  of  the 
Year  15  (1983-84)  NAEP  data  in  a  tabular  form.  This  section  is  a 
statistical summary of the results of the data collection steps outlined
above  and  is  intended  to  describe  the  sample,  not  to  estimate  the 
characteristics  of  the  population  of  American  students. 




There  were  three  samples  of  students  which  were  defined  by  being  at 
either  a  particular  age  or  a  particular  grade  level: 

Age  Class  1:  Grade  4/Age  9 
Age  Class  2:  Grade  8/Age  13 
Age  Class  3:  Grade  11/Age  17 

This  sample  was  designed  for  estimating  population  values  defined  either  by 
age  or  by  grade;  for  example,  the  sample  of  age  9  students  includes  9-year- 
old  students  in  other  grades  as  well  as  the  fourth  grade,  and  the  grade  4 
sample  contains  fourth  graders  of  all  ages,  not  just  9-year-olds. 

The  system  of  defining  age  that  was  used  in  past  assessments  was 
maintained  in  Year  15;  thus,  the  dates  of  birth  were  defined  as  follows: 

Age  9:  Born  between  January  1  and  December  30,  1974 
Age  13:  Born  between  January  1  and  December  30,  1970 
Age  17:  Born  between  October  1,  1966  and  September  30,  1967 

A  student's  grade  level  was  defined  by  the  school.  Note  that  only  17-year- 
olds  who  were  enrolled  in  school  were  sampled  in  Year  15;  out-of-school  17- 
year-olds  were  not  sampled. 


2.9.1    Measurement Instruments

The  measurement  instruments  that  were  produced  by  ETS  are  summarized  in 
Table  2(2).  The  same  number  of  instruments  were  produced  at  each  grade/age 
level.  In  addition  to  these  instruments,  some  school  level  data  were 
available  from  a  Principal  Questionnaire  developed  by  Westat  and  from  the 
QED database that was used in developing the sampling frame.

The  number  of  items  used  in  the  measurement  instruments  varies  from  one 
age  class  to  another.  The  item  counts  are  shown  in  Table  2(3).  The  Total 
column  is  not  the  sum  of  the  three  grade/age  columns  because  some  items 
were  used  for  more  than  one  age  class. 

The  reading  and  writing  exercises  were  placed  in  blocks,  and  the  blocks 
placed  in  booklets  that  were  administered  to  either  the  spiral  sample,  the 
tape  sample,  or  both.  The  assignment  of  exercises  to  types  of 
administration  is  shown  in  Table  2(4). 


2.9.2    PSU, School, and Teacher Sample Characteristics

Table  2(5)  shows  the  distribution  of  Primary  Sampling  Units  (PSUs)  that 
were  selected  for  the  Year  15  sample  by  region  of  the  country  and  by  the 
sampling  description  of  community.  The  sampling  frame  called  for  20  cells 
(4  regions  by  5  types  of  community)  but  the  Northeast  had  so  few  small 
places  and  extreme  rural  counties  (and  pseudo-counties)  that  they  were 
combined  for  sampling  purposes.  These  combined  strata  are  shown  as  a 
separate  column  in  this  table.  The  same  PSUs  were  used  for  all  age  classes. 




The cooperation rates and the characteristics of the schools
participating in this NAEP are shown in Table 2(6). There was a total of
1,480 schools in the assessment, of which 1,465 have data for at least one
student. A total of 1,382 schools returned the school questionnaire. We
have used the same method of computing cooperation rates as was used in the
Year 13 assessment; the Year 13 cooperation rates were taken from the report
Year 13 Field Operations and Data Collection Activities (Research Triangle
Institute, 1982). The Year 15 figures were taken from Westat's Report on
Sample  Selection,  Weighting,  and  Variance  Estimation:    NAEP— Year  15  (Lago, 
Burke,  Tepping,  &  Hansen,  1985).    This  table  also  shows  the  distribution  of 
schools  by  region,  school  affiliation,  size  and  type  of  community, 
urbanicity,  grade  span,  number  of  teachers,  and  number  of  students. 

The  count  of  teacher  questionnaires  is  shown  in  Table  2(7).  A 
questionnaire  was  returned  by  a  total  of  2,732  teachers  over  all  age 
classes.  Of  these  teachers,  2,685  taught  at  least  one  student  for  whom  data 
are available. A total of 52,357 students could be associated with a
teacher  who  returned  a  questionnaire. 

The number of assessment sessions, including makeup sessions, is shown
in  Table  2(8).  The  sessions  are  shown  separately  by  spiral  and  tape 
administrations. 


2.9.3    Student  Sample  Characteristics 

Data were collected for a total of 104,437 students in Year 15. The
numbers of students who were administered the spiral assessment and the
tape assessment are shown in Table 2(9). This table also includes the
number of students who were deemed by the school to be unable to respond to
the assessment situation and were thus excluded from the sample.

Tables  2(10),  2(11),  and  2(12)  show  the  sizes  of  various  subsamples 
from  the  spiral  sample  of  students  for  the  different  grade/age  levels. 
Subsamples are defined by sex, race, region of the country, parents'
education,  and  size  and  type  of  community.  Sample  sizes  are  shown 
separately  for  age  eligibles,  grade  eligibles,  age  and  grade  eligibles,  and 
for  the  entire  age  class. 
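
The four sample sizes reported for each age class are related by simple counting: a student in the age class is age eligible, grade eligible, or both, so the class total should equal the age eligibles plus the grade eligibles minus those who are both. The sketch below checks this against the TOTAL rows of Tables 2(10) and 2(12); the function name is illustrative, and the identity is an assumption about how the columns were tabulated rather than a statement taken from the report.

```python
def age_class_total(age_eligible, grade_eligible, age_and_grade):
    """Size of the union of the age-eligible and grade-eligible samples."""
    return age_eligible + grade_eligible - age_and_grade


# TOTAL row of Table 2(10), Grade 4/Age 9 spiral sample.
print(age_class_total(18945, 20095, 12953))  # 26087

# TOTAL row of Table 2(12), Grade 11/Age 17 spiral sample.
print(age_class_total(22783, 22865, 16787))  # 28861
```

The same relationship can be used as a consistency check on individual subgroup rows.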

Tables 2(13), 2(14), and 2(15) show the sizes of the same subsamples
for the excluded students by grade/age level. Sample sizes are shown
separately  for  age  eligibles,  grade  eligibles,  and  age  and  grade  eligibles 
as  well  as  for  the  entire  age  class. 

Tables  2(16),  2(17),  and  2(18)  show  the  sizes  of  the  same  subsamples 
for  the  four  tape-administered  instruments.  These  samples  are  age  eligibles 
only. 




The Year 15 data are now available on a set of public-use data tapes,
for which the accompanying NAEP 1983-84 Public-Use Data Tapes Version 3.1
Users' Guide (Barone, Norris, & Rogers, 1986) has been prepared. The
public-use data tapes contain all student, teacher, and school data except
information that was excised to preserve the respondents' anonymity.
Because the sampled students did not have an equal probability of
selection, the sampling weights are included on the data tapes. The
current edition of the public-use data tapes also contains some derived
variables such as reading and writing proficiency estimates. The
public-use data tapes developed from previous assessments by the Education
Commission of the States are also still available.
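
Because selection probabilities were unequal, population estimates should use the sampling weights rather than raw counts. A minimal sketch of a weighted proportion follows; the data values and the function name are hypothetical and do not reflect the layout or variable names of the public-use tapes.

```python
def weighted_proportion(flags, weights):
    """Weighted share of cases for which the flag is true."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for f, w in zip(flags, weights) if f) / total


# Hypothetical data: whether each student has some characteristic,
# and that student's sampling weight.
flags = [True, False, True, False]
weights = [1.0, 3.0, 1.0, 5.0]

unweighted = sum(flags) / len(flags)
weighted = weighted_proportion(flags, weights)
print(unweighted, weighted)  # 0.5 0.2
```

In this toy case the unweighted figure overstates the characteristic because the students who have it carry small weights.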




Table 2(2)

Measurement Instruments
Developed by ETS

                                              Grade/Age
                                             4/9    8/13   11/17

BIB BOOKLETS                                  57      57      57
UBIB BOOKLETS                                  6       6       6
TAPE BOOKLETS                                  4       4       4
EXCLUDED STUDENT QUESTIONNAIRES                1       1       1
TEACHER QUESTIONNAIRES                         1       1       1
SCHOOL CHARACTERISTICS QUESTIONNAIRES          1       1       1

TOTAL                                         70      70      70




Table 2(3)
Number of Items Administered

                                 Grade/Age
                               4/9    8/13   11/17   TOTAL

READING                        176     192     196     340
WRITING                         15      15      15      22
BACKGROUND AND ATTITUDE        260     273     376     378
EXCLUDED STUDENTS               72      72      72      72
TEACHER                        351     293     293     417
SCHOOL CHARACTERISTICS         247     242     247     321

TOTAL                         1121    1087    1199    1550




Table 2(4)

Number of Reading and Writing Exercises
by Type of Administration

                           Grade/Age
                         4/9    8/13   11/17

READING:
  SPIRAL ONLY             78      88      92
  TAPE ONLY                1       0      20
  SPIRAL AND TAPE         97     104      84
  TOTAL                  176     192     196

WRITING:
  SPIRAL ONLY             12      12      12
  TAPE ONLY                0       0       0
  SPIRAL AND TAPE          3       3       3
  TOTAL                   15      15      15




Table 2(5)

Allocation of PSUs
to Regions and Community Types

                                                            Small Places
                  Big    Urban   Medium    Small  Extreme      & Extreme
                 City   Fringe     City   Places    Rural          Rural    Total

NORTHEAST           5        3        4                                2       14
SOUTHEAST           3        2        4        5        2                      16
CENTRAL             5        3        4        3        3                      18
WEST                8        1        3        2        2                      16

TOTAL              21        9       15       10        7              2       64



Table 2(6)
Characteristics of Schools

                                                   Grade/Age
                                            4/9     8/13    11/17    TOTAL

TOTAL NUMBER OF SCHOOLS                     663      486      331     1480
NUMBER WITH DATA*                           661      478      326     1465
NUMBER WITH COMPLETED QUESTIONNAIRES        623      457      302     1382

COOPERATION RATE:
  YEAR 15                                  88.6     90.3     83.9     88.1
  YEAR 13                                  88.0     89.2     86.5     88.0

REGION:
  NORTHEAST                                 151       99       69      319
  SOUTHEAST                                 145      116       80      341
  CENTRAL                                   222      162       99      483
  WEST                                      145      109       83      337

SCHOOL TYPE:
  PUBLIC                                    522      337      281     1140
  PRIVATE                                    42       46       31      119
  CATHOLIC                                   97      102       19      218
  NO INFORMATION                              2        1        0        3

SIZE AND TYPE OF COMMUNITY:
  EXTREME RURAL                              69       54       36      159
  LOW METROPOLITAN                           68       51       31      150
  HIGH METROPOLITAN                          69       49       37      155
  MAIN BIG CITY                              56       47       25      128
  URBAN FRINGE                               58       45       23      126
  MEDIUM CITY                                93       62       46      201
  SMALL PLACE                               250      178      133      561

URBANICITY:
  URBAN                                     211      183       82      476
  SUBURBAN                                  207      120      110      437
  RURAL                                     243      182      139      564
  NO INFORMATION AVAILABLE                    2        1        0        3

* Several schools were sampled but no eligible students were selected. These
schools are retained in the NAEP database.




Table 2(6)

Characteristics of Schools
(continued)

                                                   Grade/Age
                                            4/9     8/13    11/17    TOTAL

GRADE SPAN:
  KINDERGARTEN TO GRADE 12                   24       32       30       86
  KINDERGARTEN TO GRADE 3                    19        0        0       19
  KINDERGARTEN TO GRADE 6                   427       28        0      455
  KINDERGARTEN TO GRADE 8                   175      204        1      380
  GRADE ... TO GRADE 8                       16      114        0      130
  GRADE ... TO GRADE 9                        0       41        7       48
  GRADE ... TO GRADE 12                       0       58       58      116
  GRADE ... TO GRADE 12                       0        8      194      202
  GRADE 10 TO GRADE 12                        0        0       41       41
  NO INFORMATION                              2        1        0        3


NUMBER OF TEACHERS:
  1 - 4                                      22       12        3       37
  5 - 9                                     112       58        9      179
  10 - 19                                   251      144       50      445
  20 - 49                                   265      209      113      587
  50 - 74                                     9       45       56      110
  75 - 99                                     1       10       55       66
  100+                                        1        6       44       51
  NO INFORMATION                              2        2        1        5

NUMBER OF STUDENTS:
  1 - 99                                     30       15        7       52
  100 - 299                                 245      134       50      429
  300 - 499                                 240      125       61      426
  500 - 749                                 116      116       43      275
  750 - 999                                  20       48       27       95
  1000 - 1499                                10       34       55       99
  1500+                                       0       13       87      100
  NO INFORMATION                              2        1        1        4



Table 2(7)

Number of Responses to
Teacher Questionnaire

                                               Grade/Age
                                        4/9     8/13    11/17    TOTAL

TEACHERS*                              1027      790      915     2732
TEACHERS WITH STUDENTS IN SAMPLE       1005      779      901     2685
STUDENTS WITH TEACHERS**              14846    20838    16673    52357

*  Some teachers responded but were not linked to any student in the sample.
** Teachers were often associated with many students.




Table 2(8)

Number of Assessment Sessions
by Type of Administration

                                 Grade/Age
                          4/9     8/13    11/17    TOTAL

NUMBER OF SESSIONS:
  SPIRAL REGULAR         1328     1327     1327     3982
  SPIRAL MAKEUP*            2        4       93       99
  TOTAL                  1330     1331     1420     4081

NUMBER OF SESSIONS:
  TAPE REGULAR            259      258      256      773
  TAPE MAKEUP*              0        0       67       67
  TOTAL                   259      258      323      840

* See Section 7.3.2 for details about makeup sessions.




Table 2(9)

Number of Students
by Type of Administration

                          Grade/Age
                   4/9      8/13     11/17     TOTAL

SPIRAL           26087     28405     28861     83353
TAPE              5492      5158      6209     16859
EXCLUDED          1416      1448      1361      4225

TOTAL            32995     35011     36431    104437




Table 2(10)

Spiral Sample by Demographic Characteristics
Grade 4/Age 9

                                     AGE       GRADE AGE & GRADE
                                ELIGIBLE    ELIGIBLE    ELIGIBLE       TOTAL

TOTAL                              18945       20095       12953       26087

SEX:
  MALE                              9496       10213        6091       13618
  FEMALE                            9449        9882        6862       12469

RACE:
  WHITE                            12635       13272        8920       16987
  BLACK                             2800        3162        1819        4143
  HISPANIC                          2640        2777        1614        3803
  OTHER                              870         884         600        1154

REGION:
  NORTHEAST                         4257        4579        3227        5609
  SOUTHEAST                         4744        5110        3198        6656
  CENTRAL                           5380        5544        3547        7377
  WEST                              4564        4862        2981        6445

PARENTS ED:
  LESS THAN HIGH SCHOOL             1089        1285         670        1704
  HIGH SCHOOL                       3628        4106        2524        5210
  GREATER THAN HIGH SCHOOL          6885        7465        5086        9264
  UNKNOWN                           7343        7239        4673        9909

STOC:
  RURAL                             1204        1305         760        1749
  DISADVANTAGED URBAN               2490        2721        1698        3513
  ADVANTAGED URBAN                  2216        2336        1724        2828
  BIG CITY                          1500        1503        1060        1943
  FRINGE                            1991        2068        1423        2636
  MEDIUM                            2913        3097        1915        4095
  SMALL                             6631        7065        4373        9323




Table 2(11)

Spiral Sample by Demographic Characteristics
Grade 8/Age 13

                                     AGE       GRADE AGE & GRADE
                                ELIGIBLE    ELIGIBLE    ELIGIBLE       TOTAL

TOTAL                              21070       21850       14515       28405

SEX:
  MALE                             10526       10928        6774       14680
  FEMALE                           10543       10920        7740       13723

RACE:
  WHITE                            15047       15525       10820       19752
  BLACK                             2922        3099        1774        4247
  HISPANIC                          2398        2471        1428        3441
  OTHER                              703         755         493         965

REGION:
  NORTHEAST                         4730        4956        3608        6078
  SOUTHEAST                         5191        5514        3571        7134
  CENTRAL                           6041        6119        4049        8111
  WEST                              5108        5261        3287        7082

PARENTS ED:
  LESS THAN HIGH SCHOOL             1870        2185        1112        2943
  HIGH SCHOOL                       7427        7751        5079       10099
  GREATER THAN HIGH SCHOOL          9495        9753        7105       12143
  UNKNOWN                           2278        2161        1219        3220

STOC:
  RURAL                             1215        1308         796        1727
  DISADVANTAGED URBAN               2189        2188        1369        3008
  ADVANTAGED URBAN                  2319        2387        1867        2839
  BIG CITY                          2225        2221        1595        2851
  FRINGE                            2841        2977        2097        3721
  MEDIUM                            2769        2981        1842        3908
  SMALL                             7512        7788        4949       10351




Table 2(12)

Spiral Sample by Demographic Characteristics
Grade 11/Age 17

                                     AGE       GRADE AGE & GRADE
                                ELIGIBLE    ELIGIBLE    ELIGIBLE       TOTAL

TOTAL                              22783       22865       16787       28861

SEX:
  MALE                             11327       11294        8006       14615
  FEMALE                           11454       11571        8781       14244

RACE:
  WHITE                            16482       16681       13017       20146
  BLACK                             3345        3331        1986        4690
  HISPANIC                          2192        2054        1285        2961
  OTHER                              764         799         499        1064

REGION:
  NORTHEAST                         5097        5185        3772        6510
  SOUTHEAST                         5766        5817        4016        7567
  CENTRAL                           6391        6355        5004        7742
  WEST                              5529        5508        3995        7042

PARENTS ED:
  LESS THAN HIGH SCHOOL             2806        2761        1740        3827
  HIGH SCHOOL                       8018        7883        5866       10035
  GREATER THAN HIGH SCHOOL         10957       11277        8596       13638
  UNKNOWN                           1002         944         585        1361

STOC:
  RURAL                             1381        1473        1063        1791
  DISADVANTAGED URBAN               2461        2329        1381        3409
  ADVANTAGED URBAN                  2968        3060        2280        3748
  BIG CITY                          2161        2139        1487        2813
  FRINGE                            2272        2223        1706        2789
  MEDIUM                            3804        3848        2939        4713
  SMALL                             7736        7793        5931        9598




Table 2(13)

Excluded Student Sample by Demographic Characteristics
Grade 4/Age 9

                                     AGE       GRADE AGE & GRADE
                                ELIGIBLE    ELIGIBLE    ELIGIBLE       TOTAL

TOTAL*                               966         826         376        1416

SEX:
  MALE                               614         532         230         916
  FEMALE                             351         291         145         497

RACE:
  WHITE                              460         369         160         669
  BLACK                              125         121          52         194
  HISPANIC                           275         247         123         399
  OTHER                              106          89          41         154

REGION:
  NORTHEAST                          229         190         105         314
  SOUTHEAST                          224         167          68         323
  CENTRAL                            212         197          81         328
  WEST                               301         272         122         451

STOC:
  RURAL                               47          53          13          87
  DISADVANTAGED URBAN                238         212          94         356
  ADVANTAGED URBAN                    94          69          49         114
  BIG CITY                            78          57          37          98
  FRINGE                             113         102          60         155
  MEDIUM                             130         112          35         207
  SMALL                              266         221          88         399

*Some demographic subgroups do not add up to Total due to missing or unresolved
subgroup data.




Table 2(14)

Excluded Student Sample by Demographic Characteristics
Grade 8/Age 13

                                     AGE       GRADE AGE & GRADE
                                ELIGIBLE    ELIGIBLE    ELIGIBLE       TOTAL

TOTAL*                               907         901         360        1448

SEX:
  MALE                               579         561         219         921
  FEMALE                             322         338         140         520

RACE:
  WHITE                              430         474         175         729
  BLACK                              202         172          68         306
  HISPANIC                           182         172          78         276
  OTHER                               93          83          39         137

REGION:
  NORTHEAST                          154         159          69         244
  SOUTHEAST                          216         243          74         385
  CENTRAL                            290         291         119         462
  WEST                               247         208          98         357

STOC:
  RURAL                               34          49          11          72
  DISADVANTAGED URBAN                204         183          95         292
  ADVANTAGED URBAN                    54          52          14          92
  BIG CITY                           108          98          45         161
  FRINGE                              88          73          36         125
  MEDIUM                             105         124          29         200
  SMALL                              314         322         130         506

*Some demographic subgroups do not add up to Total due to missing or unresolved
subgroup data.




Table 2(15)

Excluded Student Sample by Demographic Characteristics
Grade 11/Age 17

                                     AGE       GRADE AGE & GRADE
                                ELIGIBLE    ELIGIBLE    ELIGIBLE       TOTAL

TOTAL*                               983         707         329        1361

SEX:
  MALE                               640         458         218         880
  FEMALE                             342         248         111         479

RACE:
  WHITE                              371         311         135         547
  BLACK                              237         141          60         318
  HISPANIC                           249         168          98         319
  OTHER                              126          87          36         177

REGION:
  NORTHEAST                          130          94          40         184
  SOUTHEAST                          258         176          79         355
  CENTRAL                            267         218          84         401
  WEST                               328         219         126         421

STOC:
  RURAL                               48          38          20          66
  DISADVANTAGED URBAN                285         154          89         350
  ADVANTAGED URBAN                    59          54          26          87
  BIG CITY                           106          74          31         149
  FRINGE                              74          71          28         117
  MEDIUM                             163         122          54         231
  SMALL                              248         193          81         360

*Some demographic subgroups do not add up to Total due to missing or unresolved
subgroup data.




Table 2(16) 
Tape Sample by Demographic Characteristics 
Grade 4/Age 9 


                               TAPE 1   TAPE 2   TAPE 3   TAPE 4    TOTAL

TOTAL                            1403     1356     1389     1344     5492

SEX:
  MALE                            691      701      696      653     2741
  FEMALE                          712      655      693      691     2751

RACE:
  WHITE                           970      869      914      832     3585
  BLACK                           182      223      246      178      829
  HISPANIC                        186      203      181      263      833
  OTHER                            65       61       48       71      245

REGION:
  NORTHEAST                       310      335      275      288     1208
  SOUTHEAST                       348      339      349      347     1383
  CENTRAL                         409      365      411      364     1549
  WEST                            336      317      354      345     1352

PARENTS ED:
  LESS THAN HIGH SCHOOL            81       76       83      104      344
  HIGH SCHOOL                     247      280      284      277     1088
  GREATER THAN HIGH SCHOOL        567      472      509      453     2001
  UNKNOWN                         508      528      513      510     2059

STOC:
  RURAL                           148      114      128       74      464
  DISADVANTAGED URBAN             156      194      254      205      809
  ADVANTAGED URBAN                260      183      178       90      711
  BIG CITY                        130       62      168      199      559
  FRINGE                          117      152       49       70      388
  MEDIUM                          243      136      136      101      616
  SMALL                           349      515      476      605     1945
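
As a rough illustration of how the tape sample tables can be cross-checked, the sketch below takes the SEX and PARENTS ED rows of Table 2(16) and verifies two things: that each row's four tape counts add up to its TOTAL entry, and that the subgroups in each block add up to the overall TOTAL row. The helper name `check_group` is hypothetical, and the expectation that subgroups sum exactly to the total holds only where no subgroup data are missing.

```python
# Overall TOTAL row of Table 2(16): tapes 1-4, then the grand total.
TAPE_TOTALS = (1403, 1356, 1389, 1344, 5492)

# Two blocks copied from Table 2(16); each row is (tape 1..4, total).
GROUPS = {
    "SEX": {
        "MALE": (691, 701, 696, 653, 2741),
        "FEMALE": (712, 655, 693, 691, 2751),
    },
    "PARENTS ED": {
        "LESS THAN HIGH SCHOOL": (81, 76, 83, 104, 344),
        "HIGH SCHOOL": (247, 280, 284, 277, 1088),
        "GREATER THAN HIGH SCHOOL": (567, 472, 509, 453, 2001),
        "UNKNOWN": (508, 528, 513, 510, 2059),
    },
}


def check_group(rows, totals):
    """Return (rows add up across tapes, columns add up to the totals)."""
    rows_consistent = all(sum(r[:-1]) == r[-1] for r in rows.values())
    column_sums = tuple(sum(col) for col in zip(*rows.values()))
    return rows_consistent, column_sums == totals


results = {name: check_group(rows, TAPE_TOTALS) for name, rows in GROUPS.items()}
print(results)
```

The same check can be applied to the other blocks and to Tables 2(17) and 2(18); a failed column check may simply reflect missing or unresolved subgroup data rather than an error.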




Table 2(17) 
Tape Sample by Demographic Characteristics 
Grade 8/Age 13 


                               TAPE 1   TAPE 2   TAPE 3   TAPE 4    TOTAL

TOTAL                            1310     1276     1283     1289     5158

SEX:
  MALE                            676      636      637      680     2629
  FEMALE                          634      639      646      609     2528

RACE:
  WHITE                           945      889      844      915     3593
  BLACK                           187      211      226      160      784
  HISPANIC                        121      126      165      178      590
  OTHER                            57       50       48       36      191

REGION:
  NORTHEAST                       275      286      262      259     1082
  SOUTHEAST                       327      334      329      333     1323
  CENTRAL                         388      356      366      361     1471
  WEST                            320      300      326      336     1282

PARENTS ED:
  LESS THAN HIGH SCHOOL           129       92      140      114      475
  HIGH SCHOOL                     464      451      487      470     1872
  GREATER THAN HIGH SCHOOL        600      574      527      567     2268
  UNKNOWN                         117      159      129      138      543

STOC:
  RURAL                           126       62       73      146      407
  DISADVANTAGED URBAN             162      141      206      113      622
  ADVANTAGED URBAN                126       81      107      123      437
  BIG CITY                        142      102       75      126      445
  FRINGE                          194      240      213      180      827
  MEDIUM                          184      209      168      224      785
  SMALL                           376      441      441      377     1635




Table 2(18) 
Tape Sample by Demographic Characteristics 
Grade 11/Age 17 


                               TAPE 1   TAPE 2   TAPE 3   TAPE 4    TOTAL

TOTAL                            1539     1540     1596     1534     6209

SEX:
  MALE                            774      791      796      745     3106
  FEMALE                          765      749      800      789     3103

RACE:
  WHITE                          1065     1079     1158     1130     4432
  BLACK                           263      242      258      193      956
  HISPANIC                        148      163      121      172      604
  OTHER                            63       56       59       39      217

REGION:
  NORTHEAST                       300      325      336      327     1288
  SOUTHEAST                       403      407      405      379     1594
  CENTRAL                         453      391      443      459     1746
  WEST                            383      417      412      369     1581

PARENTS ED:
  LESS THAN HIGH SCHOOL           194      194      203      161      752
  HIGH SCHOOL                     537      558      564      543     2202
  GREATER THAN HIGH SCHOOL        749      696      779      775     2999
  UNKNOWN                          59       92       50       55      256

STOC:
  RURAL                           118       58       85      106      367
  DISADVANTAGED URBAN             198      181      159      179      717
  ADVANTAGED URBAN                249      190      123      221      783
  BIG CITY                        108      126      233      141      608
  FRINGE                          208      132      177      170      687
  MEDIUM                          240      249      253      240      982
  SMALL                           418      604      566      477     2065



Chapter  3 

DEVELOPMENT OF THE YEAR 15 NAEP READING AND WRITING ASSESSMENTS 

Ina  V.  S.  Mullis 
Educational  Testing  Service 


In  developing  each  subsequent  assessment,  NAEP  has  had  the  twofold 
responsibility  of  1)  measuring  trends  in  achievement,  and  2)  using  improved 
methods  to  measure  current  educational  objectives.    Because  fulfilling  the 
first  part  of  this  assignment  is  anchored  in  repeating  past  practices  and 
the  second  part  requires  innovative  new  measures,  accomplishing  NAEP's  dual 
goals  requires  ingenuity.    Many  conflicts  arise  naturally  in  developing 
unified assessments when consultants suggest evaluation approaches that are 
simply beyond the scope of NAEP's resources and capabilities. Thus, the 
development process for each assessment must be undertaken carefully; the 
process is akin to rebuilding a boat while keeping it afloat throughout the 
rebuilding. Yet, these kinds of dilemmas are familiar to NAEP, and new 
procedures  reflecting  the  lessons  of  experience  and  future  concerns  were 
systematically  introduced  into  the  Year  15  reading  and  writing  assessments. 

Development of the materials for the Year 15 assessment occurred 
primarily during NAEP's previous grant period, when the project was 
administered by the Education Commission of the States. The decision to 
assess  writing  and  reading  in  1983-84  was  made  by  the  NAEP  Assessment 
Policy  Committee;  this  is  one  of  their  duties  specified  in  the  NAEP 
legislation.    It  should  be  further  noted  that  the  NAEP  legislation  also 
mandates  assessment  of  reading  and  writing  every  five  years;  given  a 
biennial  assessment  schedule,  both  subject  areas  had  to  be  assessed  in 
1983-84  to  comply. 

Prior  to  Year  15  (1983-84),  NAEP  had  most  recently  assessed  writing  in 
Year 10 (1978-79); however, that assessment was one of three conducted in 
that  year.    This  meant  that  the  Year  10  assessment  had  been  relatively 
limited  in  scope;  therefore,  NAEP  planned  a  much  larger  assessment  in  Year 
15.    As  a  result,  staff  knew  the  development  task  would  be  substantial  and 
work  began  in  March  of  1981.    In  contrast,  NAEP  had  conducted  a  very 
extensive  combined  assessment  of  reading  and  literature  in  Year  11 
(1979-80).    Thus,  it  was  felt  that  fewer  new  materials  would  need  to  be 
developed  for  the  Year  15  reading  assessment.    The  development  of  that 
assessment  began  in  the  fall  of  1982. 

The following represent some of the major issues addressed by the 
previous  NAEP  staff  in  developing  the  new  materials  for  the  Year  15  reading 
and  writing  assessments: 




Although  the  decision  to  assess  writing  in  Year  15  was  made 
prior  to  the  decision  to  also  assess  reading,  once  reading 
assessment was underway the development of the two areas was 
coordinated as much as possible. The Year 15 assessment was 
almost conceptualized as one single area with three 
components: reading, writing, and writing about reading. 

NAEP  developed  a  large  number  of  background  questions, 
particularly  in  the  area  of  writing.    The  panels  expressed  a 
strong  desire  to  develop  questions  that  would  lead  to  a 
greater  understanding  of  writing  instruction,  student  writing 
practices,  and  student  perception  of  the  value  of  being  able 
to  write  well. 

NAEP  had  routinely  developed  school  questionnaires,  but  for 
the  first  time  teacher  questionnaires  were  developed  to  be 
administered  to  English  teachers.    The  questions  reflected 
areas  emerging  from  the  effective  schools  research  as 
conducive  to  improved  performance  as  well  as  those  of  special 
concern  to  reading  and  writing  educators. 

NAEP  had  previously  reported  writing  achievement  results 
based  on  relatively  few  writing  tasks  at  each  age  level. 
NAEP's  technical  advisory  committees  had  expressed 
reservations  about  the  generalizability  of  the  results  and 
urged  staff  to  assess  a  greater  number  and  variety  of  writing 
tasks  at  each  age  level. 

The  high  priority  given  to  the  issue  of  the  writing  process 
by  NAEP  panels  was  clear.    A  great  deal  of  effort  was 
expended  trying  to  develop  assessment  methods  to  allow 
students  to  engage  in  the  writing  process  without  dictating 
that  they  use  particular  strategies.    Given  the  parameters  of 
NAEP  procedures  and  capabilities,  however,  the  final  decision 
was  made  to  assess  students'  use  of  the  writing  process  in  a 
limited  fashion. 

As  a  result  of  the  Year  11  combined  reading  and  literature 
assessment, many of the reading comprehension passages were 
literary in nature. Given the importance of reading across 
the curriculum and in out-of-school situations, NAEP 
concentrated on a new design based on assessing reading using 
social studies, science, and literary passages, as well as 
functional  materials. 

Based  on  previous  assessments,  NAEP  was  concerned  that  the 
reading  assessment  pool  had  very  few  materials  that 
challenged  17-year-olds.    Thus,  an  effort  was  made  to  develop 
more  sophisticated  measures  of  reading  comprehension  for  use 
at  that  level. 




The following two sections, the first describing the process used to 
develop the writing assessment and the second presenting details about the 
development of the reading assessment, provide further information about 
the procedures and the consultants used to develop and select the 
assessment items, the issues raised, and the final decisions regarding the 
Year 15 assessment tasks, background questions, and questionnaires. 

3.1    Developing the Year 15 Writing Assessment 

Prior  to  the  Year  15  assessment,  NAEP  had  assessed  writing  three 
times: in Year 1 (1969-70), Year 5 (1973-74), and Year 10 (1978-79). 
However, NAEP had published only two previous sets of writing objectives, 
one in 1969 and the other in 1972, with a brief supplement added for the 
third national writing assessment. The changes in writing education that 
had taken place by the early 1980s dictated a total recasting of the 1972 
objectives rather than further modification. Therefore, in March of 1981, 
the Writing Advisory Committee met for the first time to accomplish several 
tasks crucial to the development of NAEP's Year 15 writing assessment. The 
first was to begin 
development  of  a  draft  of  new  writing  objectives  for  the  Year  15 
assessment;  the  second  was  to  develop  a  statement  of  recommendations  for 
the  design  of  the  assessment;  and  the  third  to  plan  the  development  of  the 
writing  assessment  items. 


3.1.1    The Year 15 Writing Objectives 

The  booklet  Writing  Objectives,  1983-84  Assessment  (1982)  contains  the 
names of the Year 15 Writing Advisory Committee and the numerous 
consultants  who  participated  in  developing  those  objectives. 

The objectives for the Year 15 assessment are based on the premise that 
individuals  generally  write  for  a  purpose  and  an  audience.  Some  writing  is 
personal,  intended  for  oneself  or  perhaps  an  intimate  friend,  whereas  other 
writing  is  more  public  and  is  intended  to  communicate  ideas  and  experiences 
to others. These objectives distinguish between two different major 
types of writing, describing the first under Objective I, Students Use 
Writing as a Way of Thinking and Learning, and describing the second under 
Objective II, Students Use Writing to Accomplish a Variety of Purposes. 
Objective I 
discusses  the  ways  in  which  students  may  undertake  personal  kinds  of 
writing  as  a  way  of  improving  thinking  skills  and  of  learning  both  subject 
knowledge  and  knowledge  about  themselves.    Objective  II  deals  with  the 
types  of  writing  students  are  more  likely  to  do  in  school  or  social 
settings. Objective II presents three primary purposes for public writing: 
informative, persuasive, and literary. There are, of course, other ways to 
describe  these  purposes  of  writing,  and  earlier  sets  of  NAEP  objectives 
used  somewhat  different  terminology. 


One major shift in the focus of writing education has been from an 
emphasis on writing products to an emphasis on the writing process. 
Objective III, Students Manage the Writing Process, reflects this change in 
focus. To discuss the process, it is necessary to present its components 
as if they are discrete operations; however, in reality they are interwoven 
parts  of  the  entire  process  and  not  readily  separable  in  practice.  The 
recursive  nature  of  the  writing  process  and  the  interdependence  of  the 
generating,  drafting,  revising,  and  editing  skills  it  requires  cannot  be 
overemphasized.    Objective  IV — Students  Control  the  Forms  of  Written 
Language — discusses  control  of  such  skills  as  organizing,  elaborating  and 
appropriately  using  the  conventions  of  writing  (usage  and  mechanics). 
Objective  V — Students  Appreciate  the  Value  of  Writing — underscores  the 
importance of students' learning why writing is a valuable personal and 
social  activity. 


3.1.2    Writing Objectives Development 

As  can  be  seen  from  the  following  description  of  the  objectives 
development  process,  a  wide  range  of  people  interested  in  writing  education 
participated  in  the  creation  of  the  writing  objectives  for  the  Year  15 
assessment. Work was done primarily in conferences, conducted by NAEP 
staff  and  consisting  of  approximately  five  to  eight  external  consultants 
who  drafted,  revised,  or  reviewed  the  evolving  objectives  document.  To 
maintain  continuity,  these  conferences  usually  involved  one  or  more  members 
of  the  advisory  committee.    However,  the  purpose  underlying  the  series  of 
conferences  described  below  was  to  adhere  to  NAEP's  consensus  development 
process  and  involve  people  with  as  many  different  viewpoints  as  possible  in 
the  development  of  the  objectives.    Subject-area  specialists,  parents, 
classroom  teachers,  school  superintendents,  curriculum  specialists,  state 
writing  assessment  personnel,  and  school  administrators  were  all  involved. 
All  of  these  contributors  and  reviewers  were  chosen  to  reflect  the 
perspectives  of  people  in  various  sizes  and  types  of  community,  from  many 
geographic  regions,  and  from  various  racial/ethnic  groups. 

In  March  of  1981,  the  Writing  Advisory  Committee  began  its 
deliberations  on  the  new  objectives  by  discussing  the  National  Council  of 
Teachers  of  English  (NCTE)  statement  on  Standards  for  Basic  Skills  Writing 
Programs. This document and the input of the committee members resulted in 
an outline for the Year 15 objectives as well as drafts of supporting text. 
The committee reconvened in July of 1981 to review and revise the drafts 
written by participants in the earlier meeting and by committee members in 
the  interim.    Concerns  voiced  about  the  original  draft  centered  around  the 
fact  that  the  writing  process  appeared  as  discrete  categories  when  in 
actuality  these  processes  are  varied  and  nonlinear. 

Based  on  the  work  of  the  July  meeting,  and  concerns  raised,  staff 
worked to prepare a draft document. This document was reviewed by 
additional writing specialists at a meeting in early October, revised by 
staff and consultants based on that review, and shared with state writing 
assessment  personnel  later  that  month.    Finally,  in  late  October  a 
consultant  group,  including  both  advisory  committee  members  and  additional 
writing  educators,  met  to  complete  the  draft  and  try  to  respond  to  all 
concerns  raised  by  previous  reviews. 




In  November  of  1981,  the  Lay  Review  Committee  met  to  review  the  Year  15 
Writing Assessment Objectives draft. Recommendations made by the lay 
reviewers included expanding the section that dealt with the value of 
writing and adding a section that would offer practical suggestions for the 
application  of  the  objectives. 

The  Writing  Advisory  Committee  met  again  in  December  of  1981  to 
continue  discussion  and  revision  of  the  objectives  draft.    Later  that  month 
the revised draft was reviewed by two groups of external reviewers: the 
first was a group of writing researchers and the second, a group of 
elementary,  middle,  and  high  school  teachers.    Many  suggestions  for  the 
section  about  putting  the  objectives  into  practice  were  obtained  at  the 
latter  conference. 


In  February  of  1982,  another  group  of  consultants  was  convened 
comprising both advisory committee members and additional external writing 
consultants. This group discussed recommendations from the previous two 
conferences and developed a revised objectives draft. Further, the section 
on "putting the objectives into practice" was developed and incorporated 
into  the  draft.    Teachers  and  curriculum  specialists  were  invited  to 
participate  in  a  conference  to  review  the  latest  draft.    Staff  revised  the 
draft  based  on  this  review  and,  in  March  and  April,  sent  the  revised  draft 
out  for  a  mail  review  by  approximately  30  consultants  representative  of  a 
variety  of  backgrounds  and  perspectives.    Staff  addressed  the  concerns  of 
the  mail  reviewers  and  shared  the  resultant  draft  with  a  group  of 
curriculum and instructional superintendents and coordinators from across 
the  country. 

In  May  of  1982,  a  working  group  of  consultants  was  convened  to  address 
the concerns of the curriculum review. The work of this committee was 
reviewed by a subsequent group of consultants in June. Then, in late June, 
the Writing Advisory Committee met to review the objectives draft and make 
final revisions. This draft was reviewed for bias during July and August 
by members of the National P.T.A. The final review of the objectives 
booklet by external consultants was conducted at an August conference. The 
objectives were then edited and published by the Education Commission of 
the States in late 1982. 


3.1.3    Writing  Exercise  Development 

Since  the  first  objective,  writing  to  learn,  seemed  nearly  impossible 
to  measure  under  the  time,  resource,  and  paper/pencil  methodological 
constraints  of  current  assessment  procedures,  the  NAEP  Writing  Advisory 
Committee  decided  that  writing  tasks  should  focus  on  measuring  performance 
in  informative  writing,  persuasive  writing,  and  literary  writing.    In  view 
of the objectives and past assessment experience, staff and consultants 
decided to strengthen the practice of assessing several kinds of discourse 
on the grounds that students may be proficient in some kinds of writing but 
not in others. Although some of the same skills are involved in each kind 
of writing, NAEP results amply illustrate that there are challenges and 
strategies  unique  to  each  writing  task. 




In  addition,  some  information  would  be  collected  about  student  ability 
to  manage  the  writing  process  (Objective  III).    Controlling  the  forms  of 
written  language  would  be  addressed  by  evaluating  sample  responses  for 
organization,  cohesion,  syntax,  usage,  and  mechanics;  information  about  how 
students  perceive  writing  and  writing  instruction  would  be  collected  using 
multiple-choice  instruments. 

In  summary,  the  writing  tasks  developed  for  the  Year  15  writing 
assessment  were  to  measure  student  writing  performance  in  the  areas  of 
informative  writing,  persuasive  writing,  and  literary  writing,  with  some 
tasks  including  opportunities  for  pre-writing  and/or  revision  to  gather 
information  about  student  familiarity  and  success  in  engaging  in  the 
writing  process.    Information  about  the  writing  process  and  students' 
perception  about  the  value  of  writing  would  be  measured  using 
multiple-choice  scales. 

Given this broad guidance from the Advisory Committee, NAEP consultants 
and staff began developing new exercises for the Year 15 writing assessment 
in March of 1981. A list of consultants who participated in the process is 
found at the end of the section on writing assessment development. Several 
factors contributed to the scope of the task. First, NAEP did not have 
many writing tasks available from the Year 10 assessment and, in view of the 
concerns about generalizability of results expressed by the technical 
committees, NAEP was very eager to enlarge the coverage of a variety of 
aspects of writing for the Year 15 assessment. In addition, the tasks in 
that assessment did not attend to students' managing the writing process, 
and NAEP was very interested in field testing many different formats for 
tasks that allowed students to engage in the writing process. The 
committee felt adamantly that the writing process is internalized and 
implemented differently by different people and should not be structured or 
regimented by the assessment situation. As it transpired, none of the 
formats was very successful in both allowing flexibility and "forcing 
students to provide specific evidence" that they had engaged in the writing 
process. Eventually the Writing Advisory Committee suggested collecting 
most of the information about students' use of the writing process through 
background questions and leaving traditional procedures for administering 
writing tasks in place. Finally, NAEP had very few background questions 
from the previous assessment, and the objectives emphasized the writing 
process as well as students' attitudes and values toward writing. In 
short, NAEP had planned an ambitious writing assessment for Year 15 and 
most of the materials to implement that assessment needed to be newly 
developed. 

The  first  exercise  development  conference  in  March  of  1981  focused  on 
developing  measures  of  students'  attitudes  toward  writing  and  the  value 
they placed on it. Development of writing tasks was initiated at an April 
conference. In May of 1981, three item writing conferences were held: two 
to develop writing tasks and one to develop background measures about 
writing instruction and the writing process. In June an exercise review 
conference was held and both cognitive and non-cognitive measures were 
reviewed and revised. Yet another exercise development conference was 
conducted later that month to address concerns raised by the review 
committee. 


The  Writing  Advisory  Committee  met  in  July  to  review  all  the  measures 
developed  during  the  spring  of  1981.    Based  on  its  advice  and  further 
direction, two more exercise development meetings were held, one in July 
and one prior to preparing the clearance package and conducting the first 
field test. 

In  October  and  November  of  1981,  field  tests  of  both  the  writing  tasks 
and  various  types  of  background  questions,  20  booklets  total,  were 
conducted  at  a  variety  of  sites  around  the  country.    All  of  these  items 
were  reviewed  by  the  Lay  Review  Committee.    The  results  of  the  field  tests 
were  reviewed  by  the  Writing  Advisory  Committee  and  by  an  exercise 
development  review  committee  in  December.    At  a  January  1982  development 
conference,  some  items  were  revised  on  the  basis  of  the  December  reviews 
and  additional  new  items  were  developed.    In  February  the  entire  pool  of 
items  was  reviewed  by  a  committee  of  teachers  representing  elementary, 
junior  high  and  high  school.    The  comments  and  suggestions  from  the 
teachers'  review  were  addressed  at  an  item  review  and  development 
conference held in March. 

In April and May, nine booklets of newly developed and revised items 
were field tested at sites around the country. The results of these 
field tests were shared at a meeting with teachers in June, and a writing 
development  conference  was  conducted  later  that  month  to  revise  materials 
based  on  the  field  tests  and  teachers'  suggestions.    The  Writing  Advisory 
Committee  met  in  June  and  reviewed  the  existing  pool  of  items.  Major 
difficulties  with  the  items  focused  on  trying  to  increase  the  quality  and 
length  of  the  responses  to  the  writing  tasks  and  trying  to  find  the 
vocabulary  to  ask  students  about  their  instruction  and  use  of  the  writing 
process.    These  concerns  precipitated  two  additional  exercise  review  and 
development  conferences  held  in  the  month  of  August. 

Very substantial field tests were conducted in October of 1982: thirty 
booklets per age were tested in seventeen sites across the country. The 
results were reviewed, items revised, and a subsequent field test conducted 
in December at twelve sites across the country. The Writing Advisory 
Committee met in January of 1983 to review the entire pool of items and to 
make the preliminary selections of both the writing tasks and background 
questions for the assessment. Based on this selection, staff prepared the 
writing materials for inclusion in the clearance package of the 
non-cognitive items for the Year 15 assessment, which ECS submitted to NIE 
on  February  3,  1983. 

Subsequent  to  the  transfer  of  the  NAEP  project  to  Educational  Testing 
Service, ETS staff and consultants used these materials as well as the 
cognitive items to select the writing items for inclusion in the Year 15 
assessment. The selected writing tasks and guides as well as the 
background and attitude measures were further reviewed by subject matter 
specialists and editors, as well as for bias, according to standards 
established by Educational Testing Service. These materials became the 
second and final set of writing items submitted for clearance for the Year 
15  assessment. 


A  complete  description  of  the  writing  tasks  eventually  assembled, 
printed,  and  administered  as  part  of  the  Year  15  writing  assessment  is 
found  in  Section  3.1.5. 


3.1.4    Writing  Exercise  Development  Issues 

3.1.4.1    Definitions  of  Types  of  Writing  Tasks 

The  decision  to  increase  the  number  of  informative,  persuasive  and 
literary  writing  tasks  raised  an  issue  of  considerable  consequence — domain 
definition.    Generally,  mathematical  operations  have  been  well  defined  and 
considerable  effort  has  been  devoted  to  describing  the  specifics  of 
science,  social  studies,  and  reading.    In  contrast,  relatively  little  had 
been  done  to  classify  subtasks  within  purposes  for  writing.    This  problem, 
of  course,  could  not  be  tackled  in  its  entirety.    However,  enough  progress 
was  made  to  create  writing  task  development  frameworks  and  provide  item 
writers with specifications. An overview of these frameworks follows. 


Informative  Writing  (Objective  II  A) 

Briefly,  writing  to  inform  others  can  involve  reporting  and 
retelling  events  or  experiences.    It  can  also  involve  analyzing  or 
examining  concepts  and  relationships  or  developing  new  hypotheses 
or  generalizations  from  existing  records,  reports,  and 
explanations.    Tasks  developed  to  measure  informative  writing  can 
range from simple note taking and recounting events to explaining 
concepts  and  supporting  generalizations,  with  particular  attention 
to  a  balance  between  lower-level,  or  reporting,  tasks  and  those 
tasks  requiring  higher-level,  or  analytic,  skills. 

Writing tasks requiring informational writing were designed 
to cover a range of difficulty levels, a range of audiences, a 
range of stimulus materials (including personal experience and 
given materials) and a variety of writing situations. The two 
major classifications were reporting and analysis, with 
subclassifications of each. For example, in the area of 
reporting: a note about where the student went after school; a 
letter of complaint; instructions on how to feed a pet; and a job 
application represented various functional writing tasks. Writing 
reports from diagrams or notes represented the kinds of writing 
tasks that may be required in school or business situations. In 
addition  to  these  two  contexts  for  writing,  students  were  asked  in 
some  tasks  to  write  on  the  basis  of  given  information  and  in 
others  to  write  based  on  personal  experience. 




The  analysis  tasks  also  were  designed  to  give  respondents  the 
opportunity  to  use  both  personal  experience  and  given  material  as 
the basis for presenting and supporting their ideas. These tasks 
were  designed  to  measure  higher-order  skills  by  requiring 
respondents  to  advance  from  reporting  facts  to  providing 
explanations.    Again,  there  was  an  effort  to  represent  both  school 
and  non-school  contexts. 


Persuasive  Writing  (Objective  II  B) 

Persuasive  writing  may  entail  responding  to  requests  for 
advice  by  giving  an  opinion  and  supporting  reasons.    However,  it 
usually  involves  initiating  an  attempt  to  convince  readers  by 
setting  forth  one's  own  point  of  view  with  evidence  to  back  it  up. 
Argument,  with  refutation,  becomes  part  of  persuasion  when  the 
writer  knows  there  is  opposition  to  what  he  or  she  is  advocating. 
Thus,  persuasive  writers  must  be  concerned  with  the  positions, 
beliefs,  or  attitudes  of  particular  readers  and  with  the 
possibility of winning their support or changing their beliefs or 
attitudes. 

Tasks designed to measure persuasive writing capabilities 
included items carefully constructed to range from advice-giving 
to refutation. More specifically, "convince" items required 
students to give an opinion and the supporting evidence that would 
sway a particular audience; and "refute" items required students 
to  take  a  position  contrary  to  that  of  their  audience  and  to  give 
evidence  that  would  advance  their  position  and  refute  the 
expressed  concerns  of  their  audience. 


Literary  Writing  (Objective  II  C) 

Literary  writing  provides  a  special  way  of  sharing 
experiences  and  understanding  the  world.    There  are  a  wide  variety 
of  forms  that  literary  writing  can  take,  such  as  stories,  poems, 
plays,  or  lyrics.    However,  given  the  context  of  the  assessment, 
the  panels  decided  to  focus  the  development  of  literary  writing 
tasks  in  the  area  of  storytelling.    Tasks  requiring  both 
imaginative  narratives  and  personal  experience  narratives  were 
developed to offer students opportunities to write from a basis of 
imaginative  ideas  and  a  basis  of  their  own  experience.  Also, 
several  tasks  were  developed  which  asked  students  to  attempt 
modest  poems.    Given  the  resource  constraints  of  the  actual 
assessment,  the  final  selection  reflected  only  imaginative 
narratives. 




3.1.4.2    Prior  Knowledge  Bias 


The  controversy  here  was  concerned  with  how  best  to  reduce  the  effects 
of  prior  knowledge  on  performance  without  adversely  affecting  the  level  of 
that  performance.    The  optimum  strategy  in  designing  "fair"  writing  tasks 
requires  respondents  to  have  equal  levels  of  prior  knowledge  about  each 
topic,  where  equal  level  can  be  defined  as  ranging  from  little  or  no 
knowledge to extensive knowledge.  However, it is also agreed that more
effective  writing  is  produced  when  authors  have  extensive  knowledge  or  at 
least  some  familiarity  with  the  subject.    Unfortunately,  NAEP  collects 
information  from  a  national  sample  of  students  and  universally  appealing 
topics  are  extremely  rare.    Recently,  even  such  traditional  stimulus 
standbys  as  pets,  vacations,  and  basic  emotions  have  become  suspect.  Thus, 
the  effort  to  reduce  bias  led  ironically  to  writing  topics  that  very  few 
people  were  likely  to  know  or  care  about;  eventually,  an  insidious  banality 
pervaded  the  entire  item  pool.    This  was,  of  course,  troubling,  since 
students  could  not  possibly  be  inspired  to  do  their  best  writing  with  such 
bland  prompts.    The  solution  was  to  compromise — some  universal  topics  and 
some  obscure  topics,  some  based  on  personal  experience  and  some  on  given 
material,  some  rural  and  some  urban,  some  for  girls  and  some  for  boys,  etc. 
Thus,  when  achievement  is  summarized  across  the  total  set  of  topics, 
chances  for  better  performance  should  be  maximized,  while  bias  is 
minimized. 


3.1.4.3    Should  Audience  be  Specified  in  Tasks? 

Must  a  writing  assignment  specify  an  intended  audience?    Some  writing 
task  developers  said  yes,  some  said  no.    The  "no  audience"  argument  is  that 
the  respondents  know  their  papers  will  be  read  by  teacher-like  graders. 
Therefore,  specifying  an  audience  other  than  a  NAEP  reader  brings  an 
artificiality  to  any  task  that  will  jeopardize  performance.    At  the  other 
extreme, and equally adamant, were those who insisted that it is impossible
to do well on a writing task unless a specific audience is identified.  Once
again  the  consensus  process  yielded  a  compromise.    The  persuasive  tasks 
delineate  audiences  to  create  a  context  for  the  persuasion,  while  the 
literary  items  rarely  specify  an  audience  on  the  grounds  that  the  writer  is 
frequently  his  or  her  own  primary  audience.    Some  informative  tasks  have 
audiences,  whereas  in  others  the  audience  is  left  to  the  imagination  of  the 
writer. 


3.1.4.4    How  to  Assess  the  Writing  Process 

In dealing with this issue, NAEP was not so much faced with reconciling
opposite points of view espoused by different writing experts as with
reconciling a paradox expressed by almost every advisor.  The importance of
focusing writing instruction on the writing process was clear.  Advisors,
item developers, and staff desperately wanted such measures.  However, it
was  unanimously  agreed  that  each  person  writes  best  when  allowed  to  engage 
in  the  process  as  they  have  found  it  most  effective.    This  dilemma  was 
exacerbated by the knowledge that students have not done well in past
assessments when given opportunities to revise, as they do not appear to
know  what  to  do.    Therefore,  on  one  hand,  it  was  deemed  necessary  to  give 
students  some  help  by  specifying  the  steps  they  should  engage  in  to 
accomplish  the  process;  on  the  other  hand,  forcing  students  through  various 
steps  without  allowing  for  flexibility  was  considered  detrimental. 
Unfortunately,  given  the  parameters  of  NAEP  procedures  and  capabilities, 
there  was  not  a  satisfactory  solution  to  this  measurement  problem. 
Therefore,  a  very  few  items  were  developed  that  attempted  to  measure 
aspects  of  the  writing  process.    Due  to  resource  limitations,  none  were 
included  in  the  assessment,  although  one  successful  item  type  required 
students  to  rewrite  and  improve  a  given  piece  of  writing,  rather  than  their 
own writing.  Some of these included suggested directions for improvements.
Field  tests  indicated  that  students  may  be  more  successful  with  these  tasks 
than  they  are  when  asked  to  revise  their  own  writing.    Finally,  numerous 
questions  were  developed  that  asked  students  about  the  prominence  of  the 
writing process in their instruction and whether they engage in various
aspects  of  the  writing  process  or  utilize  particular  strategies  when  they 
write. 


3.1.4.5    Summary  of  Writing  Task  Development  Issues 

Thus far this chapter has summarized NAEP discussions about some
writing evaluation issues raised during the development of the fourth
writing assessment.  What follows is a somewhat expanded overview of
the problems faced.  The NAEP resolutions described were often not the
preferred resolutions, but represented compromises based on the reality of
assessment  capabilities  and  resources. 

(1)  Should  writing  be  assessed  solely  by  collecting  writing 
samples,  or  should  some  less  costly  multiple-choice  or 
short-answer  items  be  used? 

NAEP resolution:  Only writing samples should be used.  They
increase the utility of results, in that they appear more
valid and each sample can be evaluated from a variety of
perspectives.

(2)  Should  student  performance  be  described  by  providing  detailed 
information  about  a  small  number  of  tasks  or  by  providing 
more general information about a wide variety of tasks?

NAEP resolution:  Try to do both.  Increase the number of
writing tasks to provide better information about the range
of  tasks  students  can  perform,  but  retain  the  capability  to 
provide  detailed  information  about  some  tasks. 

(3)  What  kinds  of  writing  tasks  should  be  included  in  the 
assessment? 

NAEP resolution:  Informative tasks ranging from note-taking
to analysis, persuasive tasks ranging from advice-giving to
refutation, and literary tasks including a range of
narratives.  All tasks should try to be representative of
naturally occurring writing situations and contexts both in
and out of school.

(4)  Should  audience  always  be  specified  in  writing  tasks? 

NAEP resolution:  Only in persuasive writing tasks.
Informative  and  literary  tasks  may  or  may  not  have  audience 
specified.    Further,  any  specified  audience  must  appear 
natural,  not  artificial. 

(5)  How  can  NAEP  address  the  prior  knowledge  issue? 

NAEP  resolution:    Have  a  number  of  the  tasks  based  on  given 
information.    For  those  tasks  based  on  students'  own 
experiences,  avoid  biased  items  while  being  sure  to  maintain 
a  balanced  pool. 

(6)  How  can  NAEP  measure  students'  ability  to  engage  in  and 
manage  the  writing  process? 

NAEP resolution:  Only in limited ways; perhaps in the next
assessment.

(These, of course, are not the only issues raised during the
course  of  developing  the  fourth  national  assessment  of  writing.) 


3.1.5    NAEP's Year 15 Writing Assessment Exercises


Informational  Writing — Reporting 


From  Personal  Experience 


Ages 9, 13/Grades 4, 8              Pets:  Students were asked to write a
                                    note explaining to a friend how to care
                                    for a pet while they were away on
                                    vacation, including where to find the
                                    food, how often to feed the pet, and how
                                    much food to give the pet.

Age 17/Grade 11                     Job Application:  Students were asked to
                                    provide a brief description of a
                                    desirable summer job and to describe the
                                    experiences or qualifications they had
                                    for such a job.




From  Given  Information 


Age 9/Grade 4                       Plants:  Students were asked to summarize
                                    a science experiment based on a series of
                                    pictures of different stages of a plant's
                                    growth.

Ages 9, 13, 17/Grades 4, 8, 11      Appleby House:  Students were asked to
                                    write a newspaper article based on notes
                                    provided about an unusual haunted house.

Ages 9, 13/Grades 4, 8              XYZ Company:  Students were asked to send
                                    away for a T-shirt in response to an
                                    advertisement.

Ages 9, 13, 17/Grades 4, 8, 11      Dali:  Students were asked to describe a
                                    surrealistic painting by Salvador Dali.


*  *  * 


Informational  Writing — Analytic 


From  Personal  Experience 


Ages 9, 13, 17/Grades 4, 8, 11      Favorite Music:  Students were asked to
                                    describe a favorite type of music and
                                    explain why they liked it.


From Given Information


Ages 9, 13, 17/Grades 4, 8, 11      Food on the Frontier:  This task began
                                    with a passage about frontier life;
                                    students were then asked to compare
                                    modern-day food with frontier food.


*  *  *


Persuasive  Writing — Convincing  Others 


Age 9/Grade 4                       Spaceship:  Students were asked to argue
                                    for permitting captives from outer space
                                    to return home rather than detaining them
                                    for scientific study.

Age 13/Grade 8                      Dissecting Frogs:  Students were asked to
                                    discuss and support their views on
                                    dissecting frogs in science class.

Age 17/Grade 11                     Space Program:  Students were asked to
                                    take a stand on whether funding for the
                                    space program should be cut, and why.

Ages 13, 17/Grades 8, 11            Split Session:  Students were asked to
                                    write a letter requesting a morning or
                                    afternoon school session and explaining
                                    their preference.

Ages 9, 13, 17/Grades 4, 8, 11      Swimming Pool:  Students were asked to
                                    write a letter to a swimming pool
                                    manager, convincing the person to hire
                                    them for a summer job at the pool.

Ages 9, 13, 17/Grades 4, 8, 11      School Rule:  Students were asked to
                                    express a desire for changing a school
                                    rule and to discuss why.


Persuasive  Writing — Refuting  an  Opposing  Position 


Age 9/Grade 4                       Aunt May:  This task asked students to
                                    write a letter convincing Aunt May they
                                    are old enough to travel alone even
                                    though Aunt May thinks otherwise.

Ages 9, 13/Grades 4, 8              Radio Station:  Students were asked to
                                    give reasons why their class should be
                                    allowed to visit a local radio station
                                    despite the manager's concerns.

Ages 13, 17/Grades 8, 11            Recreation Opportunity:  Students were
                                    asked to take a stand on whether a
                                    railroad track or a warehouse should be
                                    purchased and to argue on the basis of
                                    possible recreational opportunities.

Age 17/Grade 11                     Uncle:  Students were asked to write a
                                    letter to an uncle convincing him to lend
                                    his car so the student could visit a
                                    friend.  Responses needed to explain the
                                    situation, convince the uncle that the
                                    student was a safe driver, and to do so
                                    without hurting the uncle's feelings.

Age 17/Grade 11                     Bike Lane:  Students were asked to take a
                                    stand on whether a bike lane should be
                                    installed and to refute specific opposing
                                    views.


*  *  *


Imaginative  Writing 


Ages 9, 13, 17/Grades 4, 8, 11      Hole in the Box:  Students were given a
                                    picture of a box with a hole in it and an
                                    eye peeking out; they were asked to
                                    imagine themselves in the picture and
                                    then to describe the scene and how they
                                    felt about what was going on around them.

Ages 9, 13, 17/Grades 4, 8, 11      Ghost Story:  Students were asked to
                                    write a good, scary ghost story.

Ages 9, 13, 17/Grades 4, 8, 11      Flashlight:  Students were asked to write
                                    a story about adventures with a

3.1.6    Background  Questions 

In  NAEP's  attempt  to  trace  the  effects  of  instructional  practices  on 
student  performance,  the  Year  15  writing  assessment  included  more 
non-cognitive student background questions than ever before.  These focused
on the students' attitudes toward writing, the strategies they used to
complete their writing assignments, the kinds of writing they did in
school, and the kinds of instruction and help they reported that they had
received  from  their  teachers. 

Because  this  is  an  era  in  which  schools  across  the  country  have 
increased  the  priority  they  place  on  writing  instruction,  both  in  the  kinds 
and  amounts  of  writing  students  are  asked  to  do  in  school  and  in  the  kind 
as well as amount of help they receive from their teachers, it seemed
particularly  timely  to  describe  students'  perceptions  of  their 
instructional  environments  and  to  relate  these  to  writing  proficiency. 
Over 100 background questions specific to writing were included at each age
level.  Details of the non-cognitive assessment are included in Chapter 6.


3.1.7    Evaluation of Student Responses to Writing Tasks

Throughout  the  winter  of  1982-83,  conferences  were  held  with 
consultants  to  develop  and  refine  primary  trait  scoring  guides  and  document 
them with illustrative sample papers from the field tests.  The primary
trait scoring method reflects students' success in accomplishing the
specific  informative,  persuasive,  or  imaginative  writing  task.  Primary 
trait  results  for  accomplishing  the  task  are  based  on  levels  of  success. 
Responses  are  either  rated  as  unsatisfactory,  minimal,  adequate,  or 
elaborated,  or  they  are  not  rated.    Although  criteria  for  the  categories 
are  specified  in  terms  of  each  writing  task,  a  general  explanation  of  these 
levels  follows. 


Levels of Task Accomplishment

Not  rateable.    A  small  percentage  of  the  responses  were 
blank,  indecipherable,  totally  off  task,  or  contained  a  statement 
to  the  effect  that  the  student  did  not  know  how  to  do  the  task; 
these  responses  were  considered  not  rateable. 


Unsatisfactory.    Students  writing  papers  judged  as 
unsatisfactory  provided  very  abbreviated,  circular,  or  disjointed 
responses  that  did  not  represent  even  a  basic  beginning  toward 
addressing  the  writing  task. 


Minimal.  Students writing at the minimal level recognized
some or all of the elements needed to complete the task, but did
not manage the elements well enough to assure the purpose of the
task would be achieved.


Adequate.    Adequate  responses  included  the  information  and 
ideas  critical  to  accomplishing  the  underlying  task  and  were 
considered  likely  to  be  effective  in  achieving  the  desired 
purpose. 


Elaborated.    Elaborated  responses  went  beyond  the  essential, 
reflecting  a  higher  level  of  coherence  and  providing  more  detail 
to  support  the  points  made. 
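
As an illustration only, the rating categories above can be treated as an ordered scale and tallied for a set of responses.  The Python sketch below is not NAEP's scoring software: the numeric codes (0 to 4), the function name summarize_ratings, and the sample ratings are all assumptions made for the example.

```python
from collections import Counter
from enum import IntEnum


class TaskLevel(IntEnum):
    """Primary trait levels; the numeric codes are assumed, not NAEP's."""
    NOT_RATEABLE = 0
    UNSATISFACTORY = 1
    MINIMAL = 2
    ADEQUATE = 3
    ELABORATED = 4


def summarize_ratings(ratings):
    """Return the percentage of responses at each level, plus the
    percentage rated adequate or better (a hypothetical summary)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    counts = Counter(ratings)
    total = len(ratings)
    percents = {level: 100.0 * counts.get(level, 0) / total for level in TaskLevel}
    adequate_or_better = sum(
        pct for level, pct in percents.items() if level >= TaskLevel.ADEQUATE
    )
    return percents, adequate_or_better


# Invented ratings for ten responses to a single task.
sample = [
    TaskLevel.MINIMAL, TaskLevel.ADEQUATE, TaskLevel.UNSATISFACTORY,
    TaskLevel.MINIMAL, TaskLevel.ELABORATED, TaskLevel.ADEQUATE,
    TaskLevel.NOT_RATEABLE, TaskLevel.MINIMAL, TaskLevel.ADEQUATE,
    TaskLevel.MINIMAL,
]
percents, adequate_plus = summarize_ratings(sample)
```

Because the criteria for each level are specified separately for each writing task, a tally of this kind is meaningful only within a single task.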


In  addition  to  being  evaluated  in  terms  of  task  accomplishment,  student 
responses  collected  to  measure  trends  in  performance  across  assessments 
were  rated  holistically  to  provide  an  overall  estimate  of  the  relative 
fluency  of  the  writing.    Readers  did  not  make  separate  judgments  about 
a  paper's  organization,  content,  grammar,  usage,  spelling,  and  punctuation, 
but  judged  the  overall  effect  of  the  paper.    In  contrast  to  the  evaluations 
for  task  accomplishment,  where  responses  to  the  same  task  written  by  more 
than  one  age  group  were  evaluated  against  the  same  specific  criteria, 
fluency  was  evaluated  by  rating  papers  on  general  impression  relative  to 
other papers from the same age group.  (For example, a response to a given
task written by a 9-year-old was ranked in comparison to the responses
written by other 9-year-olds in the Year 15 as well as previous
assessments.)  Each response was given a rating from the highest to the
lowest according to six levels of fluency, with six being highest.


Overall quality measures are complemented with information about syntax
and mechanics.  A syntactic analysis involves breaking up each paper into
"T-units" (an independent clause and all of its modifying words, phrases,
and  clauses)  and  examining  the  ways  in  which  writers  embed  information  in 
T-units  and  join  T-units  together.    A  mechanics  analysis  involves 
classifying  the  kinds  of  errors  writers  make  in  sentence  use,  punctuation, 
spelling  and  so  forth. 
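
To make the idea of a T-unit analysis concrete, the sketch below summarizes a paper that a reader has already divided into T-units by hand.  It is a minimal Python illustration, not the procedure NAEP used: the function name t_unit_summary, the choice of mean words per T-unit as the summary statistic, and the sample paper are assumptions.

```python
def t_unit_summary(t_units):
    """Summarize a paper that has already been segmented into T-units.

    Each element of t_units is one T-unit (an independent clause with
    its modifiers) as a string.  Words are counted by splitting on
    whitespace, which is an assumption of this sketch.
    """
    lengths = [len(unit.split()) for unit in t_units if unit.strip()]
    if not lengths:
        return {"t_units": 0, "words": 0, "mean_words_per_t_unit": 0.0}
    total_words = sum(lengths)
    return {
        "t_units": len(lengths),
        "words": total_words,
        "mean_words_per_t_unit": total_words / len(lengths),
    }


# Invented paper, segmented by hand into three T-units.
paper = [
    "I fed the dog when my friend was away",
    "and it barked at me",
    "The food was in the kitchen",
]
summary = t_unit_summary(paper)
```

The first T-unit in the sample embeds a subordinate clause ("when my friend was away") and so is longer than the other two; the summary reflects embedding of this kind only through word counts.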


3.1.8    Writing Exercise Development Consultants*


Arthur Applebee
Stanford University
Stanford, CA

David Bartholomae
University of Pittsburgh
Pittsburgh, PA

Elsa Bartlett
New York University Medical Center
New York, NY

Bill Burns
Boulder High School
Boulder, CO

Courtney Cazden
Harvard University
Cambridge, MA

Jane Christiansen
National Council of Teachers of English
Urbana, IL

Charles Cooper
University of California
San Diego, CA

John Daly
University of Texas
Austin, TX

Vivian Davis
Tri-Ethnic Committee
Dallas, TX

Paul Diehl
University of Iowa
Iowa City, IA

Marjorie Farmer
Philadelphia Public Schools
Philadelphia, PA

Ed Folsom
University of Iowa
Iowa City, IA

Donald Graves
University of New Hampshire
Durham, NH

Robert Gundlach
Northwestern University
Evanston, IL

Kris Gutierrez
University of Colorado
Boulder, CO

Diane Hernandez
Lafayette Elementary School
Lafayette, CO


* See Writing Objectives, 1983-84 Assessment (1982) for a list of consultants
who participated in developing writing objectives.


Ann Humes
Southwest Regional Labs
Los Alamitos, CA

Don Jones
Jefferson County Schools
Lakewood, CO

Kenneth Kantor
University of Georgia
Athens, GA

Carl Klaus
University of Iowa
Iowa City, IA

Judith Langer
Bay Area Writing Project
University of California
Berkeley, CA

Richard Lloyd-Jones
University of Iowa
Iowa City, IA

Carol Mathews
Boulder High School
Boulder, CO

George McCulley
Michigan Technological University
Houghton, MI

Mary Meier
Eugene School District
Eugene, OR

John Mellon
University of Illinois
Chicago Circle
Chicago, IL

Patti Mendes
University of Colorado
Boulder, CO

Jeff Oliver
Lincoln Elementary School
Boulder, CO

Jesse Perry
San Diego Public Schools
San Diego, CA

Anthony Petrosky
University of Pittsburgh
Pittsburgh, PA

Edys Quellmalz
Center for the Study of Evaluation
University of California
Los Angeles, CA

Sandra Seale
Cherry Creek High School
Englewood, CO

Mary Ann Shea
University of Colorado
Boulder, CO

Yvonne Siu-Runyan
Boulder Valley Education Center
Boulder, CO

Susan Sowers
Harvard University
Cambridge, MA

Gary Stitt
Jefferson County Schools
Lakewood, CO

Lynn Troyka
City University of New York
College Bayside, NY

Tomas Vallejos
University of Minnesota
Minneapolis, MN

Faith Waters
Bucks County School District
Doylestown, PA

Darnell Williams
Bishop College
Dallas, TX

John Wood
Juchem Elementary School
Broomfield, CO




3.2    Developing  the  Year  15  Reading  Assessment 


Prior to the Year 15 assessment, NAEP had completed three assessments
of reading and one of reading and literature combined.  The first
assessments of reading and of literature were in Year 2 (1970-71).  Reading
was re-assessed in Year 6 (1974-75) and Year 11 (1979-80) using a subset of
the reading items from the first assessment.  Literature was re-assessed in
Year 11 using a few items from the first literature assessment.  Also
during Year 11, reading and literature were assessed together using a new,
combined set of items.  This document summarizes the design and development
of the Year 15 assessment of reading, including revision of the reading
objectives, exercise development, field testing, and exercise reviews.


3.2.1    NAEP's Year 15 Reading Objectives

The Year 15 reading assessment was developed to address four major
objectives (see the booklet, Reading Objectives, 1983-84 Assessment
[1984]).  The first objective, Comprehends What is Read, is central since
every other objective is an outgrowth of that one.  It includes
comprehension of various types of written materials read for a variety of
particular purposes.  The second objective, Extends Comprehension, includes
analyzing, interpreting and evaluating what has been read.

Good readers develop a variety of strategies to help them comprehend
what they read.  The third objective, Manages the Reading Experience,
addresses  how  a  reader  might  adopt  various  strategies  depending  upon  the 
characteristics  of  particular  passages,  the  reader's  knowledge  and 
experience  with  similar  materials,  and  the  reader's  purpose  for  reading. 
The  fourth  objective  is  Values  Reading. 


3.2.2    Reading  Objectives  Development 

NAEP had expended considerable effort developing the Reading and
Literature Objectives, 1979-80 Assessment, which was published in 1980.  The
eighteen-member Reading/Literature Advisory Committee guided the
development of those objectives and approximately 130 consultants
participated in the development process.  Given the extent of this effort
and that only two years had passed, NAEP explored the possibility that the
Year 11 objectives might be viable for the Year 15 assessment.  To review
the appropriateness of the Year 11 reading/literature objectives for the
Year 15 assessment, the Year 11 objectives were mailed to a group of
reading and literature experts who were asked to comment on any additions
or changes they felt should be incorporated into the objectives.

Second, the objectives were discussed by a smaller group of consultants
at a meeting held in Denver on December 9-11, 1982.  The group reviewed the
comments from the mail review (without knowing specific authors), discussed
their own comments, and then reached consensus regarding their
recommendations.  The group generally approved of the content of the
objectives, but felt the objectives should be rewritten in a clearer style.
More specifically, the group made six recommendations:

(1)  The concept of proposition needed clarification in the
comprehending objective.  A complete rewrite was drafted at
this meeting and the word "proposition" was eliminated.  The
concept of amount of text, characteristics of text, prior
knowledge required and the overall interactive nature of the
comprehending process was addressed.  The decoding aspect of
reading comprehension was mentioned only as a prerequisite and
was not identified as an area to assess.

(2)  The responding objective needed to be expanded to include
non-literary texts.  A complete rewrite of this objective also
was drafted at this meeting.  An effort was made to be sure
the connection between the comprehending and responding
objectives (all part of the same process) was clear at the
onset.  The cognitive and emotional aspects of the
comprehending and responding objectives were presented as
being interrelated.

(3)  The valuing objective needed expansion in several ways.  The
descriptions of valuing reading as a source of pleasure and
of obtaining self-understanding needed embellishment.
Mention of the value of reading in gaining practical knowledge
was seen as necessary.  A section discussing the fact that
valuing reading is not a goal for all cultural groups and that
different cultural groups gain value from reading in different
ways also required mention.  The section on the cultural role
of reading needed to be more explicit about freedom to publish
and freedom of access to published material.

(4)  The study skills objective needed to be edited.

(5)  A section that deals with issues related to text needed to be
added.  This section should include discussion of how text
structure and prior knowledge affect reading, and issues of
text selection.  Also, the variety of text types should be
discussed with a mention of the importance of practical
reading.  The section would not imply that skills of
metacognition should be assessed.

(6)  The consultants urged that a major section dealing with
instructional implications be added.

These  suggestions  were  incorporated  into  a  new  draft  of  the 
reading/literature  objectives. 

This  draft  was  reviewed  by  additional  external  consultants  and  revised 
by NAEP staff and consultants.  The resultant copy was subjected to a lay
review by persons involved and interested in education.  The comments and
suggestions of the reviewers were addressed by staff and this final draft
of the Reading Objectives was subsequently edited and published by
Educational Testing Service in 1984.  Reading Objectives, 1983-84
Assessment (1984) lists the participants in the objectives development
process.


3.2.3    Reading  Exercise  Development 

Development  of  the  Year  15  reading  assessment  began  in  the  fall  of 
1982.    The  design  included  a  new  approach:    assessing  reading  within  the 
content  areas  of  literature,  science,  social  studies  and  a  few  "out  of 
school," or media and functional reading materials likely to be encountered
by students in their day-to-day experiences.  The design for the Year 15
assessment also called for the separate assessment of reading and writing
skills as well as joint assessment of these skills.  This design was an
expansion of the model used in the Year 11 reading assessment where
students read a passage, answered multiple-choice comprehension questions
and then wrote about the passage.  The new approach seemed particularly
appropriate for measuring Objective II (analyzing, interpreting, and
evaluating what has been read).  It also reflected the new emphasis on
integrated  language  arts  instruction  and  assessment,  and  application  of 
language  skills  in  the  various  school  content  areas  and  everyday  life 
tasks. 


3.2.3.1    Passage  Selection 

Selecting  reading  passages  to  use  as  stimuli  for  the  reading 
comprehension  questions  was  the  first  step  of  the  development  process. 
Educators  and  reading  professionals  were  asked  to  select  passages  according 
to  guidelines  developed  from  NAEP's  experience  with  past  assessments.  The 
guidelines,  outlined  below,  dealt  with  the  use  of  the  stimulus  materials, 
their  length,  formats,  possible  sources  and  general  review  criteria.  The 
consultants were asked to concentrate on selecting passages in their areas
of expertise: literature, science, or social studies, and elementary or
secondary school levels.  (The consultants who participated in item
development are listed in Section 3.2.5.)


Passage  Selection  Guidelines 


Use  of  Stimulus  Materials.    It  is  essential  to  keep  in  mind 
that the materials you are selecting will be used as stimuli for
assessment  items.    It  is  important  that  they  contain  information, 
problems,  characters,  situations  that  will  supply  information  for 
developing  reading  comprehension  and  writing  items. 

Length of Stimulus Materials.  The materials that you pick
need to be relatively short.  We do not want to have the reading
process take up too much assessment time.  A guideline is to
keep all materials within the limit of 30 to 2000 words with very
few that are long.  Ideally, NAEP needs short but substantive
stimuli.

Formats  for  Stimulus  Materials.    In  the  vast  majority  of  the 
cases  NAEP  needs  normal  linear  text  material  for  stimuli,  i.e. 
material  presented  in  full  sentences  and  paragraphs.    However,  in 
a  few  cases  some  non-linear  written  material  may  be  appropriate, 
e.g.  charts,  graphs,  tables,  advertisements,  application  forms, 
and so forth.  These non-linear materials will probably be most
applicable for the single theme modules or for the out-of-school
materials that are explained later.

Sources for Stimulus Materials.  Select stimuli from existing
published material, but not from widely distributed curriculum
series.  Good sources of materials are supplementary curricular
materials and modules, material from resource books, educational
magazines and newspapers, and so forth.  You may use excerpts from
materials.  However, the material you select must stand alone as a
complete piece.

Criteria  for  Reviewing  Stimulus  Material.    Some  overall 
criteria  for  reviewing  stimulus  material  are  provided  in  Table  1 
(see  below).    These  are  the  same  criteria  that  we  will  be  using  as 
we  review  the  pool  of  materials  and  that  we  will  give  to  outside 
reviewers.    Keep  these  criteria  in  mind  throughout  your  search  for 
materials. 

Type  of  Stimulus  Materials.    The  literature  materials  may 
include  all  types  of  fiction  material,  e.g.  stories,  poems,  plays, 
etc. as well as nonfiction material that has a literary quality,
e.g.  a  vivid  character  sketch  that  describes  a  real  person. 
Social  studies  passages  should  be  typical  of  materials  that 
students  read  as  a  part  of  their  social  studies  curriculum. 
However,  as  indicated  earlier,  do  not  include  material  from  widely 
used  curricular  material.    In  selecting  materials,  assume  that 
students  have  some  basic  content  knowledge;  do  not  limit  yourself 
to  the  very  simplest  instructional  materials.    Try  to  select  some 
stimulus  materials  that  reflect  the  special  interest  of  women, 
blacks, Hispanics and other minority groups. The guidelines for
the science materials are similar to those for social studies.
Finally,  include  some  materials  which  are  more  typical  of 
out-of-school  reading.    Two  specific  types  of  material  are  sought: 
functional and media. Functional materials may include such
things as instructions, labels, forms and so forth, that students
have to deal with in their everyday functioning. Media materials
are those that are typical of newspapers, magazines, posters,
radio and television. They may include news stories, editorials,
advertisements,  commercials,  and  so  forth. 




TABLE  1 

CRITERIA FOR REVIEWING STIMULUS MATERIAL

(1) Naturally occurring

    - Stands on own, not segmented

    - Reflects materials that students commonly read in or out
      of school

(2) Interesting for target group(s)

    - Reflects the topics and genre which target age group(s)
      enjoy reading

(3) Relevant to experience of target age group(s)

    - Reflects settings and activities which students from all
      parts of the country, racial/ethnic backgrounds and
      economic backgrounds find reasonably familiar

(4) Appropriate difficulty level for target age group(s)

    - Appropriate vocabulary level and readability following
      general guidelines for difficulty

(5) Meets at least minimum standards of writing

(6) Not offensive and/or stereotypic

    - Shows wide variety (traditional and non-traditional) of
      roles, personalities, etc.

(7) Enduring

    - Relevant for future assessments (1990)

(8) Not widely read in regular school programs




3.2.3.2    Passage Review


A  conference  to  review  and  select  from  the  pool  of  passages  received 
was  held  October  7-9,  1982.    The  group  of  reviewers  was  composed  of  reading 
and  content  area  experts  representing  different  geographical,  cultural  and 
ethnic  backgrounds.    The  meeting  began  with  a  general  orientation  and 
review  of  the  passage  selection  guidelines.    After  initial  orientation,  the 
group  broke  up  into  three  small  groups — literature,  science  and  social 
studies  groups.    Each  group  read  through  their  set  of  passages,  coming  up 
with  a  consensus  rating  of  excellent,  good,  fair  or  unacceptable  for  each 
passage.    A  final  pool  of  passages  was  created  from  those  rated  as  good  and 
excellent.    These  passages  were  then  reviewed  as  a  set,  determining  which 
particular  areas  were  weak  in  quality  or  representation.    All  three  groups 
felt  that  certain  areas  needed  more  coverage,  and  that  higher  quality 
materials could be found. Both the literature and social studies groups
felt that they could find better passages with female and minority
characters and themes. The social studies group felt that global issues
and urban themes were under-represented in their pool of passages. The
science  group  felt  that  better  passages  for  17-year-olds  were  needed.  The 
consultants  were  asked  to  select  new  passages  for  these  areas. 

3.2.3.3    Exercise Writing

In October of 1982, all passages selected by the reviewers were sent to
educators and reading and measurement specialists. These consultants were
provided with instructions, guidelines and examples for writing exercises
to  accompany  the  passages;  an  overview  of  these  guidelines  is  presented 
below.    They  also  attended  one  of  two  meetings  (one  in  NAEP  offices  in 
Denver,  the  other  at  the  University  of  Illinois  at  Champaign)  for  training 
in  exercise  writing.    The  exercises  received  from  these  consultants  were 
then  reviewed,  edited  and  revised  by  NAEP  staff  members  and  an  external 
consultant.    An  item  documentation  system  was  developed  to  track  passages, 
items  and  review  ratings. 


Exercise  Development  Guidelines 

In  the  item  specifications,  we  have  provided  you  with  many 
suggested item stems, for example, "What is the main idea of the
story?" and many examples of items from the previous assessment.
These  guidelines  are  meant  to  be  suggestive,  not  mandatory.  The 
nature  of  the  passage  will  often  dictate  different  ways  of  asking 
a  particular  question,  or  different  questions  altogether.    Let  the 
text and the obvious important areas of meaning guide the types of
questions  you  ask  and  the  way  you  ask  them. 

You  will  notice  in  the  item  specifications  and  examples  that 
we  sometimes  have  different  ways  of  asking  the  same  question  for 
different  ages.    For  example,  we  ask  9-year-olds  "How  does  the 
writer make the story sound?" and we ask 13- and 17-year-olds
"What is the tone of the story?" These are essentially the same
question, but we have attempted to make the question easier for
9-year-olds.    Please  make  these  same  kinds  of  modifications  when 
you  are  writing  questions  for  9-year-olds. 


Item Difficulty. It is very difficult to draw a line between
making item distractors plausible and discriminating between good
and poor comprehension and making item distractors unnecessarily
tricky and misleading. There are two minimum criteria to follow
and  after  that  it  is  a  matter  of  judgment.    First,  both  item  stems 
and  distractors  should  use  vocabulary  and  syntax  that  are  easy  to 
understand.    The  student  should  be  tested  on  his  or  her 
comprehension  of  the  passage  only,  not  his  or  her  comprehension  of 
the  question.    Second,  each  question  should  have  a  single  correct 
answer,  one  that  can  be  clearly  defended.    Beyond  this,  we  would 
prefer  to  err  on  the  side  of  making  items  difficult  rather  than 
easy, especially for ages 13 and 17. Our current pool of
assessment  items  is  too  easy  at  the  older  ages.    This  problem  is 
partly  due  to  passages  that  are  easy  for  older  students  and  partly 
due to items with options that are very obviously correct or
incorrect.


Background  Knowledge.    Another  difficult  judgment  has  to  be 
made  regarding  the  amount  of  background  knowledge  that  is  required 
to  answer  a  question.    Background  knowledge  plays  an  important 
part  in  the  comprehension  process.    For  many  comprehension  tasks 
the  student  must  bring  some  of  his  or  her  own  knowledge  or 
experience  to  the  passage  in  order  to  answer  a  question.  While 
writing  comprehension  items,  you  need  to  keep  in  mind  two  possible 
extremes:    1)  items  that  are  based  on  background  knowledge  that 
all  students  have;  2)  items  that  are  based  on  background  knowledge 
that  very  few  students  have  or  that  only  students  from  a 
particular  group  have. 

With  respect  to  the  first  point,  you  should  not  write  an  item 
that  many  students  could  answer  without  reading  the  passage.  For 
example, you should not ask a question, "Who was George
Washington?" even though the passage was about George Washington.

With  respect  to  the  second  point,  you  should  not  write  items 
that  are  based  on  very  specialized  knowledge  or  on  knowledge  that 
only a special group of students might have. For example, you
should not write items that require detailed knowledge about the
Middle  Ages  or  detailed  knowledge  about  life  in  rural  areas. 

Sometimes  you  may  be  faced  with  a  situation  where  the  passage 
provides information that might conflict with information that the
student  may  already  have  about  a  topic.    In  order  to  reduce 
confusion  you  may  preface  the  question  with  "According  to  the 
article...." However, we suggest you use this method sparingly.
If the item you write produces a lot of dissonance between the
information in the text and general knowledge, it probably is not
a  good  item. 

Irrespective  of  the  problems  presented  by  background 
knowledge, you should not avoid writing questions that require
some background knowledge altogether. Background knowledge is a
very important requirement for higher level reading comprehension
and  cannot  be  avoided  in  the  writing  of  reading  comprehension 
items. 


Number  of  Types  of  Questions.    For  each  passage,  you  should 
write  questions  based  on  the  important  meanings  that  can  be 
derived  from  the  text.    The  number  of  questions  will  vary 
depending  upon  the  length  and  the  content  of  the  passage.  Some 
passages  are  very  short  and  only  present  a  few  ideas.    Some  are 
fairly  long  and  present  a  very  large  number  of  ideas.    As  a  guide, 
we  would  like  you  to  write  3  or  4  items  for  short  passages  (less 
than  200  words)  and  6  to  8  questions  for  longer  passages  (200  or 
more  words).    We  want  more  items  than  we  will  ultimately  use  in 
the assessment because we know that many will be deleted during
the  field-testing  and  review  processes.    However,  don't  struggle 
to  meet  a  quota.    Write  as  many  items  as  you  think  it  takes  to 
cover  the  major  meanings  of  the  passage.    For  longer  passages  you 
will  only  be  able  to  write  items  that  sample  the  major  meanings  of 
the  passage. 

Please  document  each  question,  by  indicating  the  specific 
reading  comprehension  task  it  measures  and  by  providing  a 
rationale  for  why  that  task  is  an  important  one. 


Passage Modifications. The passages that we provide you with
all  come  from  actual  published  documents.  We  will  gain  permission 
from  their  publishers  to  use  them  in  the  assessment. 

Although  NAEP  wishes  to  maintain  the  natural  quality  of  the 
passages,  slight  modifications  are  possible.  You  may  receive  a 
long  article  and  wish  to  shorten  it.  You  may  find  that  the  text 
needs  a  word  changed  or  an  introductory  sentence  added.  In  some 
cases,  you  may  need  to  add  an  advanced  organizer  to  a  passage  in 
order to give the student some background information. However,
we  suggest  you  use  advanced  organizers  sparingly. 


3.2.3.4    Bias  Review 

In  early  December,  1982,  the  reading  and  writing  passages  and  exercises 
were  sent  to  consultants  representing  various  constituent  groups  (e.g., 
minorities,  women's  groups,  large  urban  school  systems,  academicians). 
These consultants reviewed, rated and made recommendations for improvements
in the passages and exercises to be used in the assessment. The guidelines
sent to these reviewers follow.


General  Guidelines  for  Judging  Bias 

Items  should  reflect  settings  and  activities  that  are 
reasonably  familiar  to  students  from  all  regions  of  the  country, 
regardless  of  economic  or  racial/ethnic  background. 

Items  should  not  be  offensive  or  stereotypic  to  any  segment 
of  the  population.    Typical  kinds  of  stereotypic  descriptions  to 
be  eliminated  are  those  involving:    sex,  race,  culture,  ethnicity, 
older  persons  or  handicapped  persons. 

Other  stereotypic  descriptions  to  be  avoided  might  involve 
the  following:  social  roles,  psychological  traits,  physical 
appearance,  occupation,  life  style  or  language. 

Other  possible  reasons  for  classifying  passages  or  items  as 
unacceptable  are: 

-  language,  descriptions  or  situations  presented  which 
might  be  offensive  to  any  segment  of  the  population; 

-  passages  or  items  that  may  not  be  within  the  realm  of 
experience  of  students  from  particular  geographic 
areas  or  socio-economic  situations; 

-  words  that  may  not  have  a  common  meaning  for 
everyone;  and 

-  exceptionally  difficult  or  complex  vocabulary  or 
sentence  structure. 

Please use a separate rating line for each separate passage.
If a passage has several questions accompanying it, please rate
all items for that passage on the same line, indicating by part
letter any that are rated differently. If all items associated
with  a  passage  are  acceptable,  no  separate  listings  by  part  are 
required. 


3.2.3.5    December  Field  Testing 

New  passages  and  exercises  were  packaged  to  be  field  tested  at  a 
variety  of  sites  across  the  country  in  early  December. 

The  passages  and  exercises  were  packaged  into  twenty  booklets,  seven 
for 9-year-olds, seven for 13-year-olds, and six for 17-year-olds. For all
booklets,  the  exercises  were  ordered  to  vary  in  length,  type  of  passage, 
type of exercise and difficulty. Booklets began with easy items; longer
and more difficult items were placed in the middle of booklets along with
open-ended writing exercises, and shorter, easier items were placed at the
end.


3.2.3.6    Exercise  Review 

A  meeting  was  held  January  27-28,  1983  to  review  results  obtained  from 
the  December  field  tests.    A  group  of  reading,  measurement  and  curriculum 
specialists was given a set of the items to review before the meeting, as
well  as  guidelines  for  review.    At  the  meeting,  the  group  was  provided  with 
field  test  results  and  comments  from  bias  reviewers. 

Consultants  were  first  given  an  orientation  to  the  reading  assessment 
and  the  criteria  for  exercise  review  and  selection.    They  then  worked  in 
small groups, again concentrating on the areas of literature, science and
social studies. They rated each reading passage as very good, good, fair
or unacceptable. Consultants also rated each question as acceptable,
acceptable  with  modification  (suggesting  specific  modifications)  or 
unacceptable. 

In  large  group  debriefing,  the  consultants  discussed  their  feelings 
about  the  pool  of  items  selected.    The  literature  group  felt  the  literature 
passages were greatly improved over those used in the Year 11 assessment,
needing  only  an  addition  of  one  Hispanic-oriented  passage  for  9-year-olds. 
The  science  and  social  studies  groups  felt  that  the  passages  in  these  two 
subject  areas  did  not  represent  typical  textbook  material.    Also,  many  of 
the  science  and  social  studies  materials  lacked  overall  coherence — lacking 
major  premises,  supporting  examples  or  summaries.    Once  again  these 
consultants  were  asked  to  find  new  passages  to  remedy  these  problems. 


3.2.3.7    New  Selection  and  April  Field  Testing 

From  February  through  March,  a  small  number  of  new  science  and  social 
studies passages were identified and items developed. Also,
recommendations from the passage reviewers were incorporated into the
earlier items. New and revised items were then subjected to a review by
NAEP staff members and an external consultant. Reading items were packaged
to be field tested April 24-29, 1983. Some exercises that had survived
previous reviews were dropped or changed based on recommendations by
Educational Testing Service. A total of eighteen test booklets were
produced: five at age 9, seven at age 13 and six at age 17. A small
number of items were overlapped at two ages, and one item was used for all
three ages. The items were packaged in booklets with accompanying answer
sheets.    The  answer  sheets  provided  space  for  students  to  record  the  time 
after  they  completed  reading  a  passage  and  answering  questions  for  that 
passage. 

In  addition,  a  number  of  background  questions  for  students,  teachers, 
school  administrators  and  principals  were  developed  and  field  tested. 
These questions were directed at variables that have impact on student
achievement. More specifically, NAEP field tested a school characteristics
and policy questionnaire concerning use of principals' time, incorporation
of results of school effectiveness research into the school, school climate
and school improvement. The teachers' questionnaire asked about resources,
preparation  and  training,  instructional  objectives,  instructional 
practices,  materials,  evaluation  techniques  and  school  climate.  Students' 
questions  asked  about  activities  and  preferences,  their  study  and  library 
activities,  and  their  classroom  experiences. 

Staff  and  consultants  reviewed  the  field  test  results  and  reviewed  the 
entire set of newly developed and previously assessed reading items. A
selection  was  made  for  the  Year  15  assessment  and  the  background  materials, 
including  the  school  and  teacher  questionnaires,  were  prepared  for 
inclusion  in  the  clearance  package.    ECS  submitted  the  clearance  package 
containing  student  background  questions  for  reading  and  writing,  the 
English  teachers'  questionnaires,  and  the  school  characteristics  and  policy 
questionnaires  on  February  3,  1983. 


3.2.4    The  Year  15  Reading  Assessment  Exercises 

The  final  review  and  selection  of  the  items  for  the  Year  15  reading 
assessment  was  conducted  by  staff  and  consultants  at  Educational  Testing 
Service.    All  the  items  were  reviewed  by  subject  matter  specialists, 
measurement  experts,  and  editors  as  well  as  for  bias  according  to  the  ETS 
Standards  for  Quality  and  Fairness.    A  second  clearance  package  containing 
both cognitive and non-cognitive items was submitted for OMB clearance.
These  materials  became  the  basis  for  assembling  the  reading  blocks  for  the 
Year  15  assessment. 

The  Year  15  reading  assessment  materials  included  a  variety  of  tasks 
and  a  variety  of  stimulus  materials  and,  therefore,  represented  a  range  of 
topics  and  difficulty.    Students  were  asked  to  respond  to  multiple-choice 
questions,  to  answer  brief  open-ended  questions,  and  to  write  about  their 
reactions  to  what  they  read.    Short  and  long  passages,  graphically 
presented materials, poems, "real-world" materials, and reference materials
were  all  included  in  the  assessments.    The  majority  of  the  materials  were 
drawn  from  those  developed  for  previous  assessments.    To  measure  trends 
across  time,  one  group  of  exercises  had  been  used  in  three  previous 
assessments  and  a  second  group  of  exercises  had  been  administered  once 
before,  in  Year  11.    To  contribute  to  a  more  complete  picture  of  current 
levels  of  reading  performance,  new  exercises  developed  specifically  for  the 
Year  15  assessment  were  also  included.    The  new  exercises  reflect  an 
increased interest in students' abilities to read "across the curriculum"
and therefore include topics in science, the social sciences and history,
among  others. 

The student background questions asked about what students
read, both in and out of school; how often students read various kinds of
materials; how often students read for enjoyment; use of the library;
understanding the value of reading; and the reading behavior of people in
the students' homes. Details of the non-cognitive assessment are included
in Chapter 6.


The  Year  15  assessment  and  accompanying  background  questions  address 
the  following  issues: 

*  Has  students'  overall  reading  performance  changed  over  the 
last  13  years?    Over  the  last  9  or  4  years? 

*  Have  patterns  of  reading  performance  changed  over  the  same 
periods?    Do  these  patterns  vary  for  different  groups  of 
students?    Do  these  patterns  vary  among  students  who  report 
different  reading  activities  or  preferences? 

*  Does reading performance vary with different kinds of reading
material (that drawn from particular subject areas, perhaps,
or that which presents a particular reading task)?

*  Has  students'  ability  to  answer  questions  about  reference 
materials  and  study  skills  topics  changed  over  the  past  13 
years?    Over the past 9 or 4 years?

*  Has  students'  ability  to  write  about  what  they  have  read 
changed since Year 11?    If so, in what ways?

*  Have students' evaluations of what they read changed since
Year  11? 


3.2.5    Reading Exercise Development Consultants*


Ms.  Virginia  Allery 
Apple  Valley,  MN 

Dr. Fernie Baca
School  of  Education 
University  of  Colorado 
Denver,  CO 

Ms. Sharon Branscome
Sevierville,  TN 

Ms. Opaline Brice
Englewood,  CA 


Dr.  Robin  Butterfield 
Northwest  Regional  Educational 
Laboratory 
Portland,  OR 

Dr.  Carita  Chapman 
Swift  Elementary  School 
Chicago, IL

Dr.  John  Chapman 

Michigan  Department  of  Education 
Haslett, MI


*  See  Reading  Objectives,  1983-84  Assessment  (1984)  for  a  list  of  consultants 
who  participated  in  developing  reading  objectives. 




Ms.  Avon  Chrismore 

Center  for  the  Study  of  Reading 

Champaign,  IL 

Ms.  Nancy  Ciarleglio 
New  Haven,  CT 

Dr.  James  Connor 
Science Education Program
New  York  University 
New  York,  NY 

Ms.  Virginia  Cornue 

National  Organization  of  Women 

New York, NY

James  Cunningham 

University  of  North  Carolina 

Chapel  Hill,  NC 

Dr.  Billie  Day 
Teacher 

Benjamin Banneker Academic High
Washington,  DC 

Dr.  Phil  DiStefano 
School  of  Education 
University  of  Colorado 
Boulder,  CO 

Ms.  Margaret  Gallagher 

Center for the Study of Reading

Champaign,  IL 

Ms.  Tee  Gallay 
Chicago,  IL 

Mr.  Pete  Garcia 
Dixon,  NM 

Dr.  Geneva  Gay 
Purdue  University 
West  Lafayette,  IN 

Dr.  Sandra  Gibbs 

National Council of Teachers of English

Urbana,  IL 

Ms.  Carol  Gibson 
National  Urban  League 
New  York,  NY 


Mr.  Gene  Goff,  Jr. 
Poca,  WA 

Roseann  Gonzalez 
University  of  Arizona 
Tucson,  AZ 

Dr.  Kris  Gutierrez 
Director  of  Academic  Affairs 
University  of  Colorado 

Ms.  Carol  Harner 
Littleton,  CO 

Dr.  Shirley  Munoz-Hernandez 
Columbia  University 
New  York,  NY 

Mr.  Jack  Holmquist 
York,  NE 

Dr.  Shu-In  Huang 
School    City  of  Thornton 
Thornton,  CO 

Peter  Johnston 
SUNY  at  Albany 
Albany,  NY 

Dr.  Henry  B.  Maloney 
Teacher 

Seaholm  High  School 
Birmingham,  MI 

Carole  L.  Mathews 
Boulder  Valley  Schools 
Boulder,  CO 

Greg  Morris 

Pittsburgh  Public  Schools 
Pittsburgh,  PA 

Ms.  Rosa  Casarez-Najera 
Stanford,  CA 

Taffy Raphael

Michigan  State  University 
East  Lansing,  MI 

Dr.  Linda  Reed 
Lakewood,  CO 




James  Robinson 

Boulder  Valley  School  District 
Boulder,  CO 

Dr. Mary Budd Rowe
College  of  Education 
University  of  Florida 
Gainesville,  FL 

Dr.  Peter  Sanders 
College  of  Education 
Wayne  State  University 
Detroit,  MI 

Ms.  Dorothy  Sibley 
Miami  Chapter  ASUW 
Miami,  FL 

Ms. Lucille Stillwell
Bernalillo,  NM 

Dr. Violet Strahler
Dayton  Public  Schools 
Dayton,  OH 

Dorothy  Strickland 
Columbia  University 
New York, NY


Dr. Barbara Swaby
School  of  Education 
University  of  Colorado 
Colorado  Springs,  CO 

Barbara  Taylor 
University  of  Minnesota 
Minneapolis,  MN 

Dr.  Robert  Tierney 

Center  for  the  Study  of  Reading 

University of Illinois at Urbana/Champaign
Champaign, IL

Celeste  Woodley 

Boulder  Valley  School  District 
Boulder,  CO 

Ms.  Kathy  Yen 

San Francisco Public Schools
San  Francisco,  CA 




Chapter 4

SAMPLE  SELECTION  AND  INSTRUMENT  COLLECTION 


Morris  H.  Hansen 
Benjamin  J.  Tepping 
Josefina  A.  Lago 
John  Burke 

Westat, Inc.


The  sample  design  for  the  Year  15  NAEP  generally  follows  earlier 
designs  but  introduces  some  changes  to  serve  new  goals  and  increase 
efficiency. One innovation makes it possible to provide estimates for the
modal  grades  corresponding  to  ages  9,  13,  and  17.    Another  is  the 
introduction  of  a  balanced  incomplete  block  design  combined  with  a 
spiralled  procedure  for  assigning  tests  to  students.    This  change  serves 
important  analytical  purposes,  reduces  sampling  error,  and  facilitates 
administration.    A  third  design  innovation  includes  a  sample  of  teachers  of 
sampled students to correlate teacher and student characteristics.

The  sample  for  the  Year  15  NAEP  was  a  multi-stage  probability  sample, 
with  counties  or  groups  of  counties  serving  as  first-stage  sampling  units, 
elementary  and  secondary  schools  serving  as  second-stage  sampling  units, 
the  assignment  of  sessions  by  type  to  sampled  schools  serving  as  a  third 
stage  of  sampling,  and  the  selection  of  students  within  schools  and  their 
assignment  to  sessions  serving  as  the  fourth  stage  of  sampling. 

A  total  of  64  first-stage  units  was  included  in  the  sample,  and 
assessments were conducted at 1,465 schools. Various blocks or packages of
exercises  were  administered  in  these  schools  to  a  total  of  about  30,000 
students  in  each  of  the  three  grade/ages. 

To facilitate the transition to a new organization (the Educational
Testing Service [ETS] was the new grantee responsible for the NAEP project,
with Westat as the survey subcontractor), the sample of PSUs and schools was
drawn by the Research Triangle Institute (RTI), the earlier survey
subcontractor. These samples were drawn following the principles and
methods developed by RTI, and similar to those of recent earlier
assessments.^1 Procedures more or less similar to those of prior
assessments were used for subsequent stages of sampling as well, but were
modified to accommodate new goals adopted by ETS.


^1 See Final Report on National Assessment of Educational Progress:
Sampling, Weighting, and Quality Check Activities for Assessment Year 13.
June 1983 (RTI/1967/00-02F).

The  principal  new  goals  included  the  following: 

(a)  In  earlier  assessments,  the  students  sampled  and  assessed 
were  those  in  ages  9,  13,  and  17.    In  the  Year  15  assessment 
the  decision  was  made  to  draw  samples  to  assess  students  of 
ages 9, 13 and 17^2, and students of the corresponding modal
grades  4,  8,  and  11. 

(b)  In  earlier  assessments,  test  items  had  been  assembled  into 
various  packages.  The  same  package  of  items  was  administered 
to  all  students  in  a  session,  which  usually  consisted  of  a 
sample  of  about  20  students.    In  Year  15,  ETS  specified  and 
developed  a  new  procedure  in  which  exercises  were  grouped 
into  a  larger  number  of  smaller  blocks,  and  assembled  into 
test  booklets  in  a  balanced  incomplete  block  (BIB)  design. 
These booklets were then assigned to students in a rotating
or "spiral" design so that different booklets were assigned
to  each  student  in  a  session. 

In  addition,  some  of  the  assessments  were  administered  as  in 
earlier  assessments,  to  provide  comparable  procedures  for 
measuring  change.    In  these,  all  students  were  administered 
the  same  "package"  of  items,  and  in  these  sessions  the 
questions  were  presented  orally  from  a  recorded  tape  as  well 
as  visually,  or  were  paced  by  a  tape  recording. 

(c)  A  questionnaire  was  obtained  for  a  sample  of  teachers  of 
sampled  students,  to  permit  correlation  of  teacher  and 
student characteristics.

(d)  Earlier  assessments  had  identified  and  excluded  from  the 
assessment  students  with  limited  English  proficiency  or 
certain  handicaps.    For  Year  15  such  students  were  again 
excluded,  but  a  questionnaire  was  obtained  for  a  sample  of 
them  to  allow  additional  description  and  analysis. 
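
The booklet construction and spiralling described in (b) can be sketched roughly as follows. This is an illustration only, not the Year 15 configuration: it assumes a small design of seven blocks assembled into seven booklets of three blocks each, generated from the cyclic difference set {0, 1, 3} modulo 7, so that every pair of blocks appears together in exactly one booklet. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cyclic_bib(n_blocks=7, base=(0, 1, 3)):
    """Build booklets by shifting a base set of blocks cyclically.

    Assumption for illustration: {0, 1, 3} is a perfect difference set
    modulo 7, which yields a balanced incomplete block design.
    """
    return [tuple(sorted((b + shift) % n_blocks for b in base))
            for shift in range(n_blocks)]


def pair_counts(booklets):
    """Count how many booklets contain each pair of blocks."""
    counts = Counter()
    for booklet in booklets:
        counts.update(combinations(booklet, 2))
    return counts


def spiral_assign(n_students, n_booklets, start=0):
    """Hand out booklets in rotating order to the students in a session."""
    return [(start + s) % n_booklets for s in range(n_students)]


booklets = cyclic_bib()
counts = pair_counts(booklets)               # balance check: every pair once
session = spiral_assign(25, len(booklets))   # booklet index per student
```

Because the booklets are handed out in rotation, neighboring students in a session receive different booklets, and each booklet reaches roughly the same number of students in every session.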


^2 The following birthdate ranges, consistent with previous assessments,
were used to define the Year 15 age groups: January-December 1974 for Age
9; January-December 1970 for Age 13; and October 1966-September 1967 for
Age 17. To maintain comparability with previous assessments, students of
each age group, along with students in the corresponding modal grade, were
assessed at the same times of year as in prior assessments. Times of
assessment were: October-December for Grade 8/Age 13; January-February for
Grade 4/Age 9; and March-May for Grade 11/Age 17.
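
The birthdate ranges in this note amount to a simple lookup, sketched below. The sketch assumes each range covers whole calendar months (the note gives months only), and the function name is hypothetical.

```python
from datetime import date

# Birthdate ranges for the Year 15 age groups, as stated in the note.
# Assumption: each range runs from the first day of its first month
# through the last day of its last month.
AGE_GROUP_RANGES = {
    9: (date(1974, 1, 1), date(1974, 12, 31)),
    13: (date(1970, 1, 1), date(1970, 12, 31)),
    17: (date(1966, 10, 1), date(1967, 9, 30)),
}


def year15_age_group(birthdate):
    """Return 9, 13 or 17 for an age-eligible birthdate, else None."""
    for age, (first, last) in AGE_GROUP_RANGES.items():
        if first <= birthdate <= last:
            return age
    return None
```

A student outside all three ranges could still be sampled through the corresponding modal grade (4, 8 or 11); that grade-based eligibility is not modelled here.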




Some  other  changes  were  made  in  an  effort  to  reduce  costs,  or  reduce 
sampling  variances  or  nonresponse  biases,  or  both: 

(e)  Assessments  were  administered  in  moderately  larger  session 
sizes  for  Year  15  than  in  earlier  assessments. 

(f)  Adjustments  for  nonresponse  were  made  session  by  session,  as 
in  the  past,  for  the  comparably  administered  taped 
assessments.  Somewhat  different  adjustments  for  nonresponse 
were  made  for  the  assessments  administered  by  the  new 
spiral  procedures. 

(g)  A  post-stratification  procedure  was  introduced  to  replace  the 
earlier  "smoothing"  procedure. 

A  brief  general  description  of  the  Year  15  survey  design  follows, 
including  some  discussion  of  the  new  features. 


4. 1    The  Sample  of  First-Stage  Units 

The  first-stage  sample  was  a  stratified  sample  of  64  primary  sampling 
units  (PSUs),  drawn  by  RTI  to  represent  the  50  states  and  the  District  of 
Columbia.    Each  PSU  consisted  of  a  county  or  a  group  of  counties.  Counties 
were grouped only as needed to achieve a specified minimum size in terms of
numbers of eligible students. The number of PSUs to be selected for the
sample and their minimum size were specified by Westat. The specified total
of 64 PSUs to be selected was the same as the
number used for the Year 13 assessment, and was deemed minimal but
sufficient to control the PSU contribution to variance to a reasonable
level. Following is a brief description of procedures followed by RTI for
defining, stratifying and selecting the sample of PSUs^3:

(a)  Twenty primary strata of counties were defined, using 1980
Census data, based on four geographic regions by five "Sample
Description of Community" (SDOC) classes. The latter
separately identified (1) SMSA counties containing at least
10,000 population in a big city (a city of 200,000
population or more), (2) remaining counties in "big city"
SMSAs, (3) other counties containing any part of a city of
25,000 or more population, (4) all other counties not
identified as extreme rural, and (5) counties identified as
extreme rural (i.e., not having 10,000 or more urban
population, non-zero farm employment, and classified as
extreme rural on the basis of an occupational index).


For a detailed description of the selection of PSUs, see the RTI Final
Report (RTI/2589/03-00F), Primary Sample for Years 15-19 of the National
Assessment of Educational Progress.




(b)  Preliminary measures of size were computed for each county
     (frame unit) by separately estimating the enrollment of 9-,
     13-, and 17-year-old students in elementary and secondary
     schools for each county, using Quality Education Data, Inc.
     (QED) data on school grade range and total enrollment, and
     using prediction formulas developed by RTI on the basis of
     prior experience. The preliminary measure of size was the
     average enrollment of the three age classes.

(c)  Adjusted  measures  of  size  were  computed  by  doubling  the 
preliminary  measures  of  size  for  counties  identified  as 
extreme  rural  and  for  low  socio-economic  status  (Low-SES) 
tracts  of  "big"  cities.    (Low-SES  Census  tracts  were 
identified  within  the  central  big  cities  in  the  counties 
included  in  SDOC  class  1,  based  on  an  index  of  SES  computed 
for  each  Census  tract.) 

(d)  The number of PSUs to be sampled was allocated to the 20
     primary strata, approximately in proportion to the adjusted
     measures of size.

(e)  Within  the  20  primary  strata,  PSUs  consisting  of  one  or  more 
counties  were  defined  within  states  (with  minor  exceptions), 
each  PSU  to  include  a  minimum  adjusted  measure  of  size  of 
1,000.    The  PSUs  within  each  primary  stratum  were  then 
ordered  by  state  (after  states  within  a  region  were  ordered 
in  a  serpentine  manner),  and  by  percent  minority  within  state 
(with  reverse  ordering  in  successive  states). 

(f)  PSUs  were  selected  by  a  sequential  zone  selection  algorithm 
developed  by  Chromy  (1979).    For  small  PSUs  (i.e.,  those  with 
adjusted  measures  of  size  smaller  than  the  zoning  interval), 
the  selections  were  made  without  replacement,  and  with 
probability  of  selection  proportional  to  the  adjusted 
measures  of  size.    For  such  PSUs  the  use  of  the  algorithm 
made  the  PSU  sampling  weight  inversely  proportional  to  the 
adjusted  measure  of  size  of  the  PSU.    Larger  PSUs  could  be 
selected  more  than  once;  in  fact,  two  large  PSUs  were  each 
selected twice.
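
The selection step in (f) can be illustrated with a much simpler stand-in for Chromy's sequential algorithm: ordinary systematic selection with probability proportional to size. This is only a sketch under that substitution; the function name, the example measures of size, and the fixed starting point are hypothetical. It shows the two properties noted in (f): a PSU smaller than the zoning interval is selected at most once and receives a weight inversely proportional to its adjusted measure of size, while a larger PSU can be selected more than once.

```python
def systematic_pps(sizes, n_select, start):
    """Systematic PPS selection (a simplified stand-in, NOT Chromy's algorithm).

    sizes    -- adjusted measures of size, in frame (sorted) order
    n_select -- number of selections, i.e. number of zones
    start    -- starting point in [0, interval); drawn at random in practice
    Returns the zoning interval and the number of hits for each unit.
    """
    interval = sum(sizes) / n_select
    if not 0 <= start < interval:
        raise ValueError("start must lie in [0, interval)")
    # One selection point per zone, equally spaced along the cumulated sizes.
    points = [start + j * interval for j in range(n_select)]
    hits = [0] * len(sizes)
    upper = 0.0
    next_point = 0
    for i, size in enumerate(sizes):
        upper += size  # this unit covers [upper - size, upper)
        while next_point < n_select and points[next_point] < upper:
            hits[i] += 1
            next_point += 1
    return interval, hits


# Hypothetical adjusted measures of size for five PSUs in one stratum.
sizes = [1000, 1500, 6000, 1200, 2300]
interval, hits = systematic_pps(sizes, n_select=4, start=500)

# Weight of a selected small PSU: the inverse of its selection probability,
# size / interval. Unselected and large PSUs are left as None here.
weights = [
    interval / s if h and s < interval else None
    for s, h in zip(sizes, hits)
]
```

With these made-up sizes the third PSU is larger than the zoning interval, so it can receive two selection points, which mirrors the two large PSUs that were each selected twice.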


Quality  Education  Data,  Inc.  (QED)  maintains  and  updates  annually 
lists  of  schools  showing  grade  span,  total  enrollment,  school  district, 
principal's  name  and  other  information  for  each  school.    The  initial  data 
provided  by  QED  were  evaluated  against  Census  school-enrollment  data  by 
RTI,  which  led  to  some  corrections  of  the  QED  file,  made  before  the  data 
were  used  in  computing  measures  of  size  for  sampling. 




4.2    The Initial Sample of Schools


An  initial  sample  of  1,682  schools  was  selected  from  the  64  primary 
sampling  units,  with  the  selections  carried  out  independently  for  the  three 
age  classes.    A  total  of  700  schools  was  selected  for  Age  9  (and  Grade  4), 
588 for Age 13 (and Grade 8), and 394 for Age 17 (and Grade 11). However,
some  schools  contained  eligibles  for  two  or  more  of  the  age  classes  and 
were  selected  more  than  once  so  that  a  total  of  1,587  distinct  schools  was 
selected.    Enough  schools  were  selected  within  an  age  class  in  each  PSU  to 
yield  the  desired  sample  size  of  students,  with  a  reserve  to  allow  for  some 
ineligible  schools  and  for  some  non-participation  of  schools,  based  on  Year 
13  experience. 

Often,  a  relatively  efficient  procedure  is  to  draw  the  sample  with 
varying  probabilities  at  the  various  stages  of  sampling,  but  such  that  the 
overall  probability  of  selection  of  a  final  unit  in  the  sample  (in  this 
case  a  student  selected  to  take  a  particular  type  of  assessment  booklet)  is 
the  same  for  each  student.    With  some  exceptions  in  which  oversampling  or 
undersampling  was  done  by  design,  this  was  a  goal  in  the  NAEP  sample,  and 
it  affected  design  decisions  for  sampling  PSUs  and  schools  as  well  as  later 
stages  of  sampling. 

To  control  costs,  the  sample  of  schools  was  selected  to  allow  a  maximum 
of  about  200  age  or  grade  eligibles  to  be  invited  to  assessment  sessions  in 
a  school  for  Grade  4/Age  9  and  up  to  about  250  age  or  grade  eligibles  for 
Grade  8/Age  13  and  Grade  11/Age  17.    While  these  specifications  allow 
relatively  large  samples  of  students  from  some  individual  schools,  the 
average  number  of  students  assessed  per  school  was  well  below  the  maximum. 
Moreover,  only  a  small  fraction  of  students  assessed  in  a  school  is 
assessed  for  a  given  block  of  exercises.    It  was  recognized  that  variances 
would  be  increased  by  allowing  maximum  cluster  sizes  up  to  these  levels  but 
perhaps  not  unduly  in  relation  to  costs. 

After  initial  study,  it  was  estimated  that  the  number  of  students  in  a 
school  that  were  eligible  by  either  age  or  modal  grade  would  average 
roughly  1.3  times  the  number  of  age  eligibles.    This  would  vary  by  age 
class  and  from  school  to  school,  but  not  widely;  the  number  of  age 
eligibles  in  a  school  would  still  provide  satisfactory  measures  of  size  for 
use  in  sample  selection. 

As  described  below,  varying  but  roughly  equal  measures  of  size  were 
assigned  to  those  schools  containing  estimated  age-eligible  students 
ranging  from  20  to  160  (for  Age  9)  or  to  200  (for  Ages  13  and  17).  Schools 
with fewer than 20 estimated age eligibles were selected with lower
probabilities,  and  schools  above  the  indicated  maximum  size  were  selected 
with  probabilities  proportional  to  the  estimated  numbers  of  age-eligible 
students  (with  approximately  constant  numbers  of  students  to  be  subsampled 
from  them). 


^Three  schools  were  selected  twice  for  Age  17  and  Grade  11. 




With the adoption of these general specifications, the sampling of
schools by RTI proceeded approximately as follows:


(a)  The estimated number of age eligibles, $E_i$, was computed for
     school i, using QED information for school year 1982-83. The
     number in each grade was estimated by dividing total
     enrollment by the number of grades; the number of
     age eligibles was estimated by applying the RTI prediction
     formulas.


(b)    For  the  "big-city"  PSUs 

(i)  An  SES  index  was  assigned  to  each  school  (based  on 
employment,  unemployment,  occupational,  and  income 
data  from  the  1980  Census  for  each  Census  tract, 
and  by  approximately  matching  the  zip  codes  to  the 
Census  tracts). 

(ii)  Schools  were  classified  as  Low-SES  (Stratum  1),  and 
Other  (Stratum  2).    After  establishing  a  cutoff  for 
the  SES  index  to  define  the  two  strata,  the  schools 
were  ordered  by  estimated  number  of  age  eligibles 
in  ascending  order  in  Stratum  1  and  descending 
order  in  Stratum  2.    For  other  PSUs  the  schools 
were  ordered  by  size. 


(c)  A preliminary measure of size, $s'_i$, was assigned to each
     school, based on the estimated number of age eligibles $E_i$,
     illustrated as follows for Age 9, for which n = 20 is the
     planned full-session size:

     (i)   If school i had six or fewer estimated
           age eligibles, $s'_i = .25$;

     (ii)  If school i had seven to nineteen estimated
           age eligibles, $s'_i = E_i/20$;

     (iii) If school i had 20 or more age eligibles but less
           than 160, $s'_i = E_i/(20 k_i)$,

^See  Section  3.1.4  of  School  Sampling  Procedure  for  Year  15  of  the 
National  Assessment  of  Educational  Progress,  September  1983 
(RTI/2589/02-00F). 



           where $k_i$ is the number of sessions of 20 that can
           be accommodated by $E_i$; and

     (iv)  If school i had 160 or more age eligibles,
           $s'_i = E_i/160$.

(d)  A final measure of size, $s_i$, was computed for each school by
     doubling the preliminary measure of size for those schools in
     "big-city" PSUs that had been assigned to the low-SES
     stratum, and by using $s_i = s'_i$ for all other schools.

(Note  that  the  extreme  rural  PSUs  were  already  oversampled  by 
a  factor  of  2,  which  had  the  effect  of  doubling  the  school 
sample  in  these.) 

(e)  The number of schools to be selected in an age class was
     computed separately for each PSU to yield approximately the
     desired number of students to be tested, after making
     approximate allowance for school and student nonresponse and
     for ineligible schools. The number of schools to be
     selected, t, is

          $t = \dfrac{nm}{[\ldots]}$    (denominator illegible in the source)

     where

          n       is the number of students per full age session
                  (e.g., 20 for Age 9);

          m       is the number of full age-eligible sessions
                  assigned to the PSU;

          $\bar{k} = \sum_i s'_i k_i \big/ \sum_i s'_i$,
                  that is, the weighted average of the $k_i$ (the number
                  of age-eligible sessions available in school i, as
                  used in computing the measures of size); and

          $s'_i$  is defined above.

(f)  The t schools were then selected in the PSU for the age class
     by sampling with probabilities proportionate to the measures
     of size, $s_i$. It was recognized that a school might be
     selected twice for the same age class by this procedure, and
     thus (to avoid administering more than ten sessions in a
     school) it might be necessary to transfer sessions to another
     sampled school. (Actually, only three schools were selected
     twice, and these were for Age 17.)
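
The measure-of-size rules in steps (c) and (d) can be sketched as a small function for the Age 9 case. This is an illustration under stated assumptions, not RTI's program: the function names are hypothetical, the numerator of rule (iii) is taken to be the estimated number of age eligibles (the printed formula is damaged), and "the number of sessions of 20 that can be accommodated" is read as the whole number of full sessions.

```python
def preliminary_measure_of_size(e_i, n=20, max_sessions=8):
    """Preliminary measure of size s'_i for Age 9.

    e_i          -- estimated number of age eligibles in school i
    n            -- planned full-session size (20 for Age 9)
    max_sessions -- sessions at the size ceiling (160 / 20 = 8 for Age 9)
    """
    if e_i <= 6:
        return 0.25                      # rule (i)
    if e_i < n:
        return e_i / n                   # rule (ii)
    if e_i < n * max_sessions:
        k_i = e_i // n                   # assumed: whole sessions of n students
        return e_i / (n * k_i)           # rule (iii), assumed numerator E_i
    return e_i / (n * max_sessions)      # rule (iv)


def final_measure_of_size(e_i, low_ses_big_city=False):
    """Final measure s_i: doubled for low-SES schools in big-city PSUs."""
    s_prime = preliminary_measure_of_size(e_i)
    return 2 * s_prime if low_ses_big_city else s_prime
```

Under this reading, every school with 20 to 159 estimated age eligibles gets a measure between 1 and 2, which is consistent with the "varying but roughly equal measures of size" described in Section 4.2.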

A  detailed  description  of  the  initial  selection  of  the  sample  schools 
is given in the RTI final report cited previously.

4.3    Updating the School Sample

ETS  made  the  initial  contacts  with  sampled  school  districts  to  obtain 
participation.    The  districts  were  then  requested  by  Westat  to  identify 
schools  that  were  new  since  the  time  of  the  QED  list,  or  schools  with 
changes  in  grade  range  or  major  changes  in  enrollment.    These  were  given 
appropriate  chances  to  be  in  the  sample  using  probability-sampling 
procedures.    Also,  the  sample  was  supplemented  in  a  few  PSUs  where  losses 
due  to  closed  schools  or  other  changes  left  too  few  schools  in  the  sample. 
A  Principal  Questionnaire  showing  updated  grade  and  enrollment  figures  and 
certain  other  school  characteristics  was  requested  from  each  of  the 
cooperating  schools  prior  to  the  assessment. 

Some  substitutions  were  made,  as  needed  and  to  the  extent  feasible,  for 
non-cooperating  schools.    Generally,  substitutions  were  made  for  schools 
refusing  to  participate  in  the  assessments  if  their  omissions  would  result 
in  an  unacceptable  balance  in  school  type  among  the  schools  assessed, 
according  to  the  size  of  the  school  and  the  socio-economic  status  of  the 
community,  or  would  result  in  a  substantial  reduction  in  the  number  of 
students  tested.    In  general,  substitution  of  schools  was  made  within  the 
same  PSU,  but  in  a  few  cases  losses  in  one  PSU  were  compensated  for  by 
additional  assessments  in  the  sampled  schools  in  another  PSU.    In  three 
cases  substitute  schools  were  obtained  from  a  neighboring  and  similar 
county  (not  a  member  of  the  primary  sample  of  PSUs). 

Table 4(1) summarizes the selection and participation of schools. The
cooperation rates obtained were approximately the same as those obtained
for the Year 13 NAEP (an overall rate of 88.1 percent for Year 15 and of
88.0 percent for Year 13).
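
The cooperation rate in Table 4(1) is defined as (B+E+F)/(B+C+D+E+F), using the row labels of that table. As a check on the arithmetic, the short sketch below recomputes the rates from the table counts; the function and variable names are illustrative, and the two F entries for Grade 11/Age 17 and for the total (1 and 5) are read from a damaged cell in the scanned table.

```python
def cooperation_rate(b, c, d, e, f):
    """Percent cooperation: (B + E + F) / (B + C + D + E + F)."""
    return 100.0 * (b + e + f) / (b + c + d + e + f)


# Counts from Table 4(1), in the order (B, C, D, E, F):
# B = no eligibles enrolled, C = district refused, D = school refused,
# E = cooperating with assessment conducted, F = cooperating, no student sample.
table_4_1 = {
    "Grade 4/Age 9":   (17, 61, 19, 607, 0),
    "Grade 8/Age 13":  (64, 42, 14, 455, 4),
    "Grade 11/Age 17": (17, 40, 21, 299, 1),
    "Total":           (98, 143, 54, 1361, 5),
}

rates = {
    group: round(cooperation_rate(*counts), 1)
    for group, counts in table_4_1.items()
}
# rates reproduces the published 88.6, 90.3, 83.9 and 88.1 percent.
```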


4.4    The  Assignment  of  Sessions  to  Schools,  by  Type 

The  assignment  of  sessions  to  schools  was  done  separately  by  the  two 
types  of  sessions,  designated  "spiral"  and  "tape." 

As  discussed  in  Chapter  5,  the  balanced  incomplete  block  (BIB)  design 
together  with  spiralling  (or  interspersing)  the  assessment  booklets  was 
introduced  for  the  first  time  in  Year  15.    This  made  it  possible  to 
correlate  results  for  all  pairs  of  exercises  in  the  BIB  design.  The 
exercises  were  divided  into  blocks  of  items,  each  block  also  containing 
some  background  questions.    The  blocks  were  assembled  into  63  test 




                               Table 4(1)

         Summary of NAEP Year 15 School Participation Experience

                                  Grade 4/   Grade 8/   Grade 11/    Total
                                  Age 9      Age 13     Age 17       Sample

Initially selected schools          700        588        394        1,682
Supplemental selections              17          2          1           20
New schools added to sample           2          1                       3
Total original sample               719        591        395        1,705

  Out-of-range or closed (A)         15         12         17           44
  No eligibles enrolled (B)          17         64         17           98
  District refused (C)               61         42         40          143
  School refused (D)                 19         14         21           54
  Cooperating - No student
    sample (F)                        0          4          1            5
  Cooperating - Assessment
    conducted (E)                   607        455        299        1,361

Cooperation rate =
  (B+E+F)/(B+C+D+E+F)              88.6       90.3       83.9         88.1
  (Year 13)                       (88.0)     (89.2)     (86.5)       (88.0)

Replacement for refusals*            67         28         34          129
  Out-of-range or closed              3          0          0            3
  No eligibles enrolled               5          3          1            9
  Refusals                            5          2          6           13
  Assessment conducted               54         23         27          104

Total contacted schools             786        619        429        1,834
Total assessments conducted         661        478        326        1,465


*Includes schools added through the partial PSU replacement procedure and
school-by-school substitution.




booklets,  most  containing  three  blocks  as  well  as  a  set  of  background 
questions  common  to  all  the  booklets,  so  that  each  block  occurred  in  the 
same  number  of  booklets  and  each  pair  of  blocks  occurred  in  the  same  number 
of  booklets.    As  a  result,  it  was  expected  that  each  block  of  items  would 
be  administered  to  about  2,000  students  in  each  grade/age  and  each  pair  of 
blocks  would  be  administered  to  about  200  students  in  each  grade/age.  The 
booklets  were  assembled  systematically  into  packages,  arranged  so  that  the 
starting  booklet  varied  from  session  to  session. 
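
The balance property described here (every block in the same number of booklets, every pair of blocks in the same number of booklets) can be checked mechanically. The sketch below does so for a toy design of 7 blocks in 7 booklets of 3 blocks each; it is not the 63-booklet NAEP design, and the function names and the cyclic construction are assumptions chosen only to keep the example small.

```python
from collections import Counter
from itertools import combinations


def cyclic_booklets(base_blocks, n_blocks):
    """Develop one base booklet cyclically (mod n_blocks) into n_blocks booklets."""
    return [
        tuple(sorted((b + shift) % n_blocks for b in base_blocks))
        for shift in range(n_blocks)
    ]


def balance_summary(booklets):
    """Count how often each block and each pair of blocks appears."""
    block_counts = Counter(b for booklet in booklets for b in booklet)
    pair_counts = Counter(
        pair for booklet in booklets for pair in combinations(booklet, 2)
    )
    return block_counts, pair_counts


# Toy design: 7 blocks (numbered 0..6), 3 blocks per booklet.
booklets = cyclic_booklets(base_blocks=(0, 1, 3), n_blocks=7)
block_counts, pair_counts = balance_summary(booklets)

# A design is balanced when both sets below contain a single value.
blocks_balanced = len(set(block_counts.values())) == 1
pairs_balanced = len(set(pair_counts.values())) == 1
```

Because every pair of blocks appears together in some booklet, responses to any two exercises can be correlated, which is the property the text attributes to the BIB design.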

The  tape  design  used  an  administration  procedure  like  that  of  earlier 
NAEP  assessments  so  as  to  provide  direct  comparison  with  the  results  of 
earlier  assessments  and  to  calibrate  the  results  of  the  spiral  design.  The 
administration  of  each  booklet  used  a  tape  recording,  as  in  earlier 
assessments.    The  specified  sample  size  was  such  that  each 
tape-administered  booklet  was  expected  to  be  administered  to  about  1,250 
students. 

A  preliminary  allocation  of  sessions  was  made  to  the  sampled  schools 
based  on  the  QED  1982-83  information  on  enrollment  and  grade  range  for  use 
in  making  initial  arrangements  with  the  schools.    These  were  revised  later 
on  the  basis  of  the  Principal  Questionnaire  which  provided  enrollment  by 
grade  and  information  on  SES  status  and  minority  enrollment  for  the  school. 

For  the  purpose  of  this  allocation,  small  schools  were  clustered  with 
others  in  the  sample  so  that  there  was  an  estimated  minimum  of  eight  (and 
usually  more)  age-eligible  students  in  each  school  cluster.    The  allocation 
of  tape  sessions  was  made  first,  by  ordering  the  school  clusters  by  an 
index  of  socio-economic  status  (based  on  the  information  provided  in  the 
Principal  Questionnaire)  and  by  size,  then  selecting  a  systematic  sample  of 
four  school  clusters  with  probability  approximately  proportional  to  the 
estimated  number  of  age-eligible  students  in  the  school  cluster.    The  next 
step  was  to  assign  one  spiral  session  to  each  school  cluster  not  selected 
for  a  tape  session  and  to  allocate  the  balance  of  the  spiral  sessions 
specified  for  the  PSU  to  school  clusters  approximately  proportionate  to  the 
estimated  number  of  students  (eligible  by  age  or  grade)  that  would  be 
available  after  the  initial  assignment  of  tape  and  spiral  sessions. 
Details of the allocation appear in the Report on Sample Selection,
Weighting and Variance Estimation: NAEP Year 15 (Lago, Burke, Tepping, &
Hansen, 1985).


4.5    The  Samples  of  Students 

A  total  of  about  29,300  students  was  to  be  tested  for  each  grade/age, 
including  students  for  the  corresponding  modal  grade.    This  means  an 
average  of  about  460  completed  assessments  per  PSU  for  each  grade/age.  On 
the basis of the experience in Year 13, conservative estimates were made of
the proportion of students that would be excluded from testing because of
language  or  other  disability  and  of  the  proportion  of  students  invited  for 
assessment  that  would  actually  complete  the  assigned  test.    These  estimates 
led  to  the  determination  of  the  sampling  rate  to  be  applied  in  each  sample 
school.    Since  the  estimates  were  conservative,  the  number  of  students 




assessed was expected to exceed the target. For Grade 4/Age 9, 31,579
students were assessed; for Grade 8/Age 13, 33,563 students were assessed;
and for Grade 11/Age 17, 35,070 students were assessed.
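
The arithmetic behind this planning step can be sketched as follows. The report does not give the conservative exclusion and completion estimates that were actually used, so the rates plugged in below are the observed Year 13 Age 9 figures from Tables 4(4) and 4(5), used purely as an illustration; the function name is hypothetical.

```python
def students_to_select(target_completed, exclusion_rate, participation_rate):
    """Students to sample so that about target_completed assessments result.

    Sampled students are first reduced by exclusions; the remainder are
    invited, and a fraction participation_rate of those complete the test.
    """
    return target_completed / ((1.0 - exclusion_rate) * participation_rate)


# About 29,300 completed assessments spread over 64 PSUs.
per_psu = 29_300 / 64   # roughly 460 per PSU, as stated in the text

# Illustrative only: observed Year 13 Age 9 rates, not the planning values.
needed = students_to_select(
    29_267, exclusion_rate=0.051, participation_rate=0.905
)
```

Using estimates that overstate losses inflates the number of students selected, which is why the number actually assessed was expected to exceed the target.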

A  Student  Listing  Form  (SLF)  was  filled  out  for  each  participating 
school;  all  enrolled  students  of  the  specified  age  (9,  13  or  17)  and  all 
others  in  the  corresponding  modal  grade  (4,  8  or  11)  were  to  be  entered  on 
the  SLF  in  any  order  convenient  for  the  school.    In  a  few  instances  for 
very  large  schools,  only  a  sample  of  students  was  listed  on  the  SLF.  The 
SLF was ordinarily prepared by the school, but Westat staff assisted or
prepared  the  form  when  desirable  or  necessary. 

After the SLF was completed, the selection of sample students was
carried out briefly as follows:

(a)  A computer-generated listing of sample SLF line numbers was
     prepared in advance by Westat to identify the students to be
     included in the sample. When the number of students listed
     on the SLF differed widely from the anticipated number,
     communication was handled by telephone and a new set of
     sample line numbers was supplied.

(b)  The  sample  line  numbers  also  identified  the  type  of  session, 
spiral  or  tape,  to  which  a  sampled  student  was  assigned. 

(c)  The  names  of  students  selected  for  the  sample  were  reviewed 
by  appropriate  school  personnel  to  identify  sampled  students 
who  for  language  reasons  or  certain  types  of  handicaps  would 
be  unable  to  take  the  test  and  thus  should  be  excluded. 

Makeup sessions were scheduled in schools in which the students
assessed constituted less than 75 percent of the selected sample in the
case of spiral sessions, less than 50 percent in the case of tape sessions
for 9-year-olds and 13-year-olds, and less than 75 percent in the case of
17-year-olds. Very few makeup sessions were necessary for 9- and
13-year-olds. For the 17-year-olds, makeup sessions were conducted in
about 20 percent of the sample schools.
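
The makeup-session rule can be written as a small decision function. This is one reading of the sentence above (75 percent for all spiral sessions and for 17-year-olds, 50 percent for tape sessions at ages 9 and 13), and the function name and argument names are hypothetical.

```python
def needs_makeup(session_type, age, n_assessed, n_selected):
    """Return True if a makeup session would be scheduled under the stated rule.

    session_type -- "spiral" or "tape"
    age          -- 9, 13 or 17
    """
    if session_type not in ("spiral", "tape") or age not in (9, 13, 17):
        raise ValueError("unknown session type or age")
    if session_type == "spiral" or age == 17:
        threshold = 0.75
    else:  # tape sessions for 9- and 13-year-olds
        threshold = 0.50
    return n_assessed / n_selected < threshold
```

For example, a tape session with 12 of 20 selected students assessed (60 percent) would trigger a makeup session for 17-year-olds but not for 13-year-olds.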


4.6    The  Sample  of  Excluded  Students 

The Year 15 assessment, as in previous assessments, excluded students
who were functionally handicapped to the extent that they could not
participate in the assessment as it was normally conducted. Specific
groups excluded were:

(1)  students  with  limited  English  proficiency; 

(2)  students identified as having behavioral disorders; and

(3)  students physically or mentally handicapped, including
     Educable Mentally Retarded (EMR), in such a way that they
     could not respond to NAEP exercises as they were normally
     administered.

In  Year  15  a  sample  of  excluded  students  was  drawn  and  data  collected 
about  them.    In  most  cases,  students  to  be  excluded  from  assessment  were 
identified  before  sampling  but  were  sampled  at  the  same  rates  as  any  other 
eligible  student.    In  other  cases,  excluded  students  were  identified  only 
for  students  selected  for  the  sample. 

For each sampled excluded student, an Excluded Student Questionnaire,
which focused on the nature of the student's problem and the school's
approach to handling it, was filled out by school personnel. This data
collection effort for excluded students was a new feature of the Year 15
assessment permitting national estimates of this subgroup of age- and
grade-eligible students. Table 4(2) shows the distribution of excluded
students by reason for exclusion for the three grade/ages.

4.7    Student Participation Results

The NAEP sample was designed to yield a target number of spiral
assessments and of each of the four tape assessments. Table 4(3) compares
the target assessments to the actual assessments for the three grade/ages.

As indicated previously, the allocation of sessions to schools and
sampling rates within schools were based on the Year 13 proportion of
excluded students identified and student participation rate, and the Year 15
target number of completed assessments. Tables 4(4) and 4(5) compare the
Year 13 and Year 15 proportion of excluded students and student
participation rates, respectively. As shown, the student participation rates
in Year 15 were about 2 percent higher (for Grade 4/Age 9 and Grade 8/Age 13)
and 8 percent higher (for Grade 11/Age 17) than the participation rates for
corresponding age classes in Year 13. Also, the losses due to excluded
students were smaller for Grade 4/Age 9 and Grade 8/Age 13 in Year 15. As a
result, and because some reserves were provided for in allocating the
sample to allow for the possibility of greater losses than anticipated on the
basis of Year 13 experience, the Year 15 actual assessments shown in Table
4(5) were considerably higher than the target of 29,267 assessments per
grade/age.

4.8    The Associated Teacher-Student Sample

In  addition  to  the  student  data  collection  effort,  NAEP  also  collected 
data  on  a  sample  of  English  or  language  arts  teachers  who  were  identified 
as  the  principal  such  teacher  of  a  subsample  of  one  or  more  of  the 
grade/age-eligible  students  in  the  spiral  sample.    The  objective  of  the 
survey  was  to  collect  for  analysis  data  that  involve  the  characteristics  of 
a  student's  teacher. 

The  teachers  who  participated  in  the  teacher  survey  were  selected  as 
follows:    From  those  students  selected  for  spiral  sessions  in  a  school,  a 
subsample  of  students  was  selected  equal  to  the  number  of  spiral  sessions 




                               Table 4(2)

        Weighted and Unweighted Distribution of Excluded Students,
                  by Reason for Exclusion and Grade/Age


                                                Grade 4/Age 9
                                         Unweighted          Weighted
   Reason                              Count  Percent     Count   Percent

A  Physical or mental handicap           761     54       91,538     53
B  Behavioral disorder                   102      7       11,488      7
C  Handicap and limited English
   proficiency                           102      7       11,488      7
D  Limited proficiency in English        453     32       56,922     33

   All reasons                         1,418*   100      171,436    100


                                                Grade 8/Age 13
                                         Unweighted          Weighted
   Reason                              Count  Percent     Count   Percent

A  Physical or mental handicap           971     67      120,261     67
B  Behavioral disorder                   102      7       13,117      7
C  Handicap and limited English
   proficiency                            86      6       12,440      7
D  Limited proficiency in English        289     20       33,236     19

   All reasons                         1,448    100      179,054    100


                                                Grade 11/Age 17
                                         Unweighted          Weighted
   Reason                              Count  Percent     Count   Percent

A  Physical or mental handicap           817     59       68,042     59
B  Behavioral disorder                    48      4        4,733      4
C  Handicap and limited English
   proficiency                           106      8        8,824      8
D  Limited proficiency in English        390     29       33,563     29

   All reasons                         1,361    100      115,162    100


*Two Grade 4/Age 9 excluded students were not retained on the NAEP database
due to insufficient data.




                               Table 4(3)

     Comparison of Year 15 Target Assessments to Actual Assessments,
                              by Grade/Age


                        Grade 4/Age 9      Grade 8/Age 13     Grade 11/Age 17
                        Target   Actual    Target   Actual    Target   Actual

Spiral assessments      24,267   26,087    24,267   28,405    24,267   28,861

Tape assessments*        5,000    5,492     5,000    5,158     5,000    6,209
  Booklet 64             1,250    1,403     1,250    1,310     1,250    1,539
  Booklet 65             1,250    1,356     1,250    1,276     1,250    1,540
  Booklet 66             1,250    1,389     1,250    1,283     1,250    1,596
  Booklet 67             1,250    1,344     1,250    1,289     1,250    1,534

Total                   29,267   31,579    29,267   33,563    29,267   35,070


* Tape assessments were administered to age only.




                               Table 4(4)

    Comparison of Year 13 and Year 15 Proportion of Excluded Students,
                              by Grade/Age


                         Year 13           Year 15
Grade/Age                Excluded (%)*     Excluded (%)

Grade 4/Age 9               5.1               4.3
Grade 8/Age 13              5.2               4.1
Grade 11/Age 17             3.5               3.7


* Year 13 assessment was administered to age only.




                               Table 4(5)

      Comparison of Year 15 and Year 13 Student Participation Rates,
                      by Type of PSU and Grade/Age


                                 Not       Invited to   Participa-  Year 13
                     Assessed  Assessed    Assessment   tion Rate   Participation
                       (a)       (b)        (c=a+b)       (a/c)     Rate**

Grade 4/Age 9
  PSU Type A*         22,101     2,336       24,437        90.4        90.5
  PSU Type B*          9,478       694       10,172        93.2        90.5
  Total               31,579     3,030       34,609        91.3        90.5

Grade 8/Age 13
  PSU Type A*         23,234     3,563       26,797        86.7        85.0
  PSU Type B*         10,329     1,342       11,671        88.5        90.0
  Total               33,563     4,905       38,468        87.3        85.5

Grade 11/Age 17
  PSU Type A*         25,406     5,700       31,106        81.7        66.0
  PSU Type B*          9,664     1,592       11,256        85.9        82.0
  Total               35,070     7,292       42,362        82.8        74.2


*  PSUs Type A are the urban PSUs (SDOCs 1, 2 and 3); PSUs Type B are the
   non-urban PSUs (SDOCs 4, 5 and 6).

** Year 13 assessment was administered to age only.




assigned to the school. The principal English or language arts teacher for
each  of  these  sample  students  was  identified  by  the  school  and  was  asked  to 
complete  a  Teacher  Questionnaire.    Before  an  assessment  began,  all  students 
in  each  session  were  asked  to  code  their  principal  English  teacher  in  the 
box  provided  on  the  cover  of  the  exercise  booklet.    Thus,  it  was  possible 
to  associate  the  sample  of  teachers  with  assessed  students. 

The conditional probability that a spiral assessment selected student
(of teacher k) had his or her teacher in the survey is given by

     $P_k = 1 - \binom{n - n_k}{t} \Big/ \binom{n}{t}$

where the symbol $\binom{a}{b}$ denotes the number of combinations of a
things taken b at a time, and

n    =  the total number of students invited to spiral assessments;

n_k  =  the number of students invited to spiral assessments whose
        teacher is the kth teacher of the school, k = 1, 2, ..., K; and

t    =  the number of spiral-invited students subsampled for the teacher
        survey.
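
With n, n_k and t as defined above, the chance that at least one of teacher k's n_k students falls in a simple random subsample of t from the n invited students is one minus a ratio of two binomial coefficients. The sketch below computes that quantity and the corresponding weight factor; it assumes this hypergeometric form is the intended formula (the printed equation is damaged in the source), and the function names are hypothetical.

```python
from math import comb


def teacher_inclusion_probability(n, n_k, t):
    """P_k = 1 - C(n - n_k, t) / C(n, t).

    n   -- students invited to spiral assessments in the school
    n_k -- those among them whose principal teacher is teacher k
    t   -- students subsampled for the teacher survey
    """
    if not (0 < n_k <= n and 0 < t <= n):
        raise ValueError("require 0 < n_k <= n and 0 < t <= n")
    # comb() returns 0 when t > n - n_k, so P_k is then exactly 1.
    return 1.0 - comb(n - n_k, t) / comb(n, t)


def teacher_adjusted_weight(student_weight, n, n_k, t):
    """Student weight multiplied by the reciprocal of P_k."""
    return student_weight / teacher_inclusion_probability(n, n_k, t)
```

A teacher with many invited students is almost certain to enter the survey, so the weights of that teacher's students are adjusted only slightly, while students of a teacher with few invited students receive a larger factor.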

If  any  English  teacher  was  identified  for  more  than  one  of  the 
subsampled  students,  the  teacher  completed  only  one  questionnaire.  Thus, 
the  number  of  completed  questionnaires  was  smaller  than  the  number  of 
students  subsampled  for  the  teacher  survey. 

Since the principal teacher was recorded only for assessed students,
P_k was approximated by replacing n_k and n by the numbers of assessed rather
than invited students. Students whose teachers were surveyed have their
weights multiplied by the reciprocal of P_k in any analyses that involve
relating teacher characteristics to student characteristics. The weights
were further adjusted, within PSUs, to account for the fact that not all
assessed students indicated their principal language arts teacher and not
all sampled teachers returned a completed questionnaire. They were also
adjusted within PSUs by a post-stratification procedure so that the sum of
the weights for students in the teacher sample was equal to the sum of the
weights for students in the spiral sample. From the figures shown in
Table 4(6), because of either nonresponse or overlap we lost about 25
percent of the Grade 4/Age 9 sampled teachers, 38 percent of the Grade 8/
Age 13 sampled teachers and 35 percent of the Grade 11/Age 17 sampled
teachers. In addition, about 2 percent of the teachers completing a
questionnaire were not linked to an assessment booklet; that is, the
subsampled student through whom the teacher was brought into the sample was
either absent, excluded, or had recorded a different teacher than had been
recorded by the school as the student's principal language arts teacher,
and no other tested student had reported that teacher.




                               Table 4(6)

     Distribution of Teachers by Grade/Age and Participation Status


                                Distinct                 Teacher
                    Sampled     Sampled     Responding   Response    Linked
Grade/Age           Teachers    Teachers    Teachers     Rate (%)    Teachers

Grade 4/Age 9        1,361       1,066        1,025         96        1,004
Grade 8/Age 13       1,275         821          790         96          779
Grade 11/Age 17      1,406         980          915         93          901

Total                4,042       2,867        2,730         95        2,684
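
The percentages quoted in the text can be recomputed from the counts in Table 4(6). The sketch below assumes that the "lost" percentage is one minus responding teachers over sampled teachers, that the response rate is responding over distinct sampled teachers, and that the unlinked share is one minus linked over responding teachers; these definitions are inferred from the figures, and the names used are illustrative.

```python
# Counts from Table 4(6): sampled, distinct sampled, responding, linked.
table_4_6 = {
    "Grade 4/Age 9":   (1361, 1066, 1025, 1004),
    "Grade 8/Age 13":  (1275, 821, 790, 779),
    "Grade 11/Age 17": (1406, 980, 915, 901),
    "Total":           (4042, 2867, 2730, 2684),
}


def teacher_rates(sampled, distinct, responding, linked):
    """Return (percent lost, response rate, percent unlinked), all in percent."""
    lost = 100.0 * (1 - responding / sampled)      # nonresponse or overlap
    response = 100.0 * responding / distinct       # among distinct teachers
    unlinked = 100.0 * (1 - linked / responding)   # no linked booklet
    return lost, response, unlinked


summary = {
    group: tuple(round(x) for x in teacher_rates(*counts))
    for group, counts in table_4_6.items()
}
```

Under these definitions the rounded values agree with the 25, 38 and 35 percent losses and the roughly 2 percent of unlinked teachers reported in the text.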




Chapter  5 

THE ASSIGNMENT OF EXERCISES TO STUDENTS


Albert  E.  Beaton 
Eugene  G.  Johnson 
John  J.  Ferris 

Educational  Testing  Service 


The  purpose  of  the  National  Assessment  of  Educational  Progress  (NAEP) 
is  to  estimate  the  performance  in  particular  subject  areas  of  various 
subgroups  of  students,  at  specific  age  or  grade  levels.    In  the  past  as 
well  as  at  the  present,  NAEP  has  aimed  at  providing  information  on  a  broad 
spectrum  of  appropriate  and  important  skills  and  performances  in  the 
subject  areas  it  has  assessed.    This  information  has  been  and  continues  to 
be  provided  at  the  level  of  subgroups  of  the  population  rather  than  at  the 
level  of  the  individual  student. 

To  accomplish  this  purpose,  there  is  no  need  for  precise  measures  for 
any  individual  student.    Consequently,  it  is  not  necessary  or  even 
desirable  that  each  individual  student  take  the  entire  battery  of  exercises 
designated  for  the  student's  grade  or  age  level.    In  addressing  the  problem 
of  estimating  the  proportion  of  a  population  who  could  correctly  respond  to 
a population of items (given a fixed number of item responses), Lord (1962)
has shown that a sample with many persons taking just one item each
results in an estimator with a smaller standard error than one derived
from a sample in which fewer persons respond to many items.  Since such a
sampling scheme is ordinarily not cost-effective because selecting
individuals is expensive, a number of exercises are presented to each
sampled individual.

Both  ETS  and  ECS  (the  previous  grantee  for  NAEP)  employed  multiple 
matrix  sampling  techniques  for  the  assignment  of  a  set  of  exercises  to 
subsamples  of  students.    The  matrix  sampling  approaches  in  both  cases 
enable  broad  coverage  of  a  given  subject  area  in  terms  of  the  total  number 
of  exercises  which  can  be  assessed  while  restricting  the  effort  required  of 
any  individual  student.    The  two  approaches,  however,  have  some  fundamental 
differences. 

The ECS multiple matrix sampling design divided the entire pool of
exercises designated for a given age group into a number of distinct sets,
called packages, each of which would take a student about three quarters of
an hour to complete.  Using this approach, the six hours of assessment
exercises allocated to an age group would result in eight packages.  Since
no student was administered more than one package, this simple matrix
design allowed the calculation of measures of relation between exercises
within the same package but not between exercises in different packages.

¹The tables and figures for this chapter were produced by David Freund.

To  remedy  this  deficiency,  ETS  has  chosen  a  complex  variant  of  multiple 
matrix  sampling  called  Balanced  Incomplete  Block  (BIB)  spiralling.  This 
approach  continues  to  allow  the  broad  coverage  of  subject  areas  and  also 
allows  the  study  of  the  interrelationships  among  all  exercises  within  and 
between  subject  areas.    The  basic  idea  is  to  divide  up  the  total  assessment 
time  into  small  blocks.    Each  exercise  block  is  then  assigned  to  a  number 
of  assessment  booklets  such  that  each  block  of  exercises  is  paired  with 
each  other  block  in  some  booklet.    The  booklets  are  then  spiralled  so  that 
students  in  an  assessment  session  are  given  different  booklets.    Using  BIB 
spiralling,  a  large  number  of  booklets  must  be  created,  but  the 
interrelationships between objectives may be examined since each exercise
is  paired  with  each  other  exercise  in  some  booklet. 

The  BIB  spiralling  method  of  exercise  assignment  has  another  advantage 
over  the  previous  technique.    In  the  ECS  method  of  item  administration, 
after  the  sample  of  students  within  a  school  was  selected  and  brought  to  an 
assessment  session,  the  same  package  of  exercises  was  distributed  to  all 
students  within  that  session.    This  administration  of  the  same  exercises  to 
clusters  of  students  within  a  school  was  necessary  because  the 
administration  of  a  package  was  accompanied  by  a  paced  audiotape  of  the 
exercise stimuli, designed to minimize the effect of a student's reading
ability  on  performance  in  other  subject  areas.    Unfortunately,  this 
administration  of  the  same  exercises  to  clusters  of  students  within  schools 
also  results  in  a  potential  increase  in  sampling  variability  over  a  simple 
random  sample  of  the  same  number  of  students  because  of  intra-cluster 
correlation.    In  contrast,  in  the  spiralled  mode  of  item  administration  a 
set  of  exercises  is  presented  to  fewer  persons  in  a  school  and  to  more 
schools.    This  results  in  a  marked  reduction  in  the  intra-school  cluster 
effect  over  the  package  administration  procedure  of  previous  assessments. 
Consequently,  since  the  sample  of  students  is  more  efficiently  utilized, 
the  required  sample  size  to  achieve  a  given  standard  error  is  reduced. 
Alternatively,  the  standard  error  for  a  given  sample  size  will  be  reduced. 
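
The effect of intra-cluster correlation on sampling variability can be illustrated with the widely used design-effect approximation, deff = 1 + (m - 1) * rho, where m is the number of students per school who take the same exercises and rho is the intra-cluster correlation. This approximation is not stated in the chapter; it is introduced here as an assumption, and the values of rho and m below are hypothetical, chosen only to show the direction of the effect.

```python
def design_effect(cluster_size, rho):
    """Approximate variance inflation relative to a simple random sample of equal size."""
    return 1.0 + (cluster_size - 1) * rho


def effective_sample_size(n_students, cluster_size, rho):
    """Size of the simple random sample that would give the same variance."""
    return n_students / design_effect(cluster_size, rho)


RHO = 0.10             # hypothetical intra-school correlation
PACKAGE_CLUSTER = 25   # hypothetical: a whole session takes the same package
SPIRAL_CLUSTER = 3     # hypothetical: only a few students per school share a block

package_n_eff = effective_sample_size(2500, PACKAGE_CLUSTER, RHO)
spiral_n_eff = effective_sample_size(2000, SPIRAL_CLUSTER, RHO)

print(f"Package administration: effective n is about {package_n_eff:.0f}")
print(f"Spiralled administration: effective n is about {spiral_n_eff:.0f}")
```

With these illustrative inputs, 2,000 spiralled students carry more information than 2,500 students assessed in intact sessions, which is the qualitative argument made above.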

The  remainder  of  this  chapter  will  detail  the  spiralling  process  as 
implemented  for  the  NAEP  and  will  discuss  its  perceived  advantages  and 
disadvantages.    First,  however,  the  considerations  in  developing  the  design 
of  the  assessment  instruments  and  the  interplay  between  the  amount  of 
substantive  coverage  and  the  sample  size  will  be  discussed. 

5.1    Considerations in the NAEP Assessment Design

The  design  of  any  study  is  circumscribed  by  the  amount  of  funds 
available;  thus,  the  NAEP  staff  had  to  decide  how  to  allocate  its  resources 
to allow the broadest possible assessment of its Year 15 subject areas,
reading and writing.  The decisions that resulted in the final design were
as  follows: 


(1)  Each  student  would  be  asked  to  participate  for  about  three 
quarters  of  an  hour.    To  have  a  national  assessment  at  all 
requires  the  cooperation  of  schools,  and  we  felt,  as  did  the 
ECS  staff  before  us,  that  limiting  the  intrusion  on  individual 
students  to  about  one  class  period  would  help  us  gain 
acceptance  in  the  schools.    The  design  originally  called  for 
46  minutes  of  assessment  for  each  student,  but  was  extended  to 
48  minutes  when  a  review  of  the  early  data  showed  that 
students  were  not  reaching  some  important  background  and 
attitude  questions. 

(2)  The  available  funds  were  sufficient  to  gather  data  on  about 
30,000  students  at  each  grade/age  level.    It  should  be  noted 
that,  under  the  terms  of  the  grant,  the  Research  Triangle 
Institute  provided  the  sample  of  schools.    Westat,  who  is  the 
ETS  subcontractor  for  sampling  and  field  administration, 
reviewed  the  sample  and  studied  some  preliminary  data 
collection  plans  to  estimate  the  number  of  students  who  could 
be  assessed  for  the  available  funds.    Thirty  thousand  students 
at 48 minutes per student resulted in an expected total of
24,000  hours  of  testing  time  at  each  grade/age  level. 

(3)  Each  exercise  would  be  responded  to  by  about  2,600  students  at 
each  grade/age  level.    In  past  assessments,  around  2,500  to 
2,600  students  at  each  age  level  were  targeted  for  each 
exercise.    We  felt  that  the  efficiencies  of  spiralling  would 
allow  us  to  reduce  the  number  of  students  from  about  2,500 
taking  each  exercise  (as  in  earlier  years)  to  about  2,000 
without  an  increase  in  sampling  error.    However,  we  were 
committed  to  sample  both  the  age  levels  which  were  sampled  in 
the  past  (ages  9,  13,  and  17)  and  the  grades  into  which  most 
of  those  youths  fell  (grades  4,  8,  and  11).    We  estimated  that 
a  sample  of  2,600  at  each  grade/age  level  would  result  in  a 
sample  of  about  2,000  at  each  age  and  also  about  2,000  at  each 
grade. 

(4)  Five  thousand  students  at  each  age  level  would  be  designated 
to  receive  audiotaped  assessment.    Data  has  been  collected  in 
national  assessments  since  1969,  and  we  did  not  want  to  lose 
continuity  with  the  data  already  collected.    Since  we  were 
making a change from audiotaped administration to
pencil-and-paper administration, we felt that we needed to determine
what effect the method of administration had on the
performance  of  students  on  assessment  exercises.    Therefore,  a 
sample  of  5,000  students  was  designated  for  assessment  using 
the  same  procedures  as  in  the  past. 

(5)  There  would  be  six  minutes  of  questions  common  to  all 
students.  Some questions, such as racial/ethnic
identification, are so important in the assessment that they
must  be  asked  of  every  student.  At  first,  four  minutes  were 
allowed  for  such  questions,  but  early  experience  required  us 
to  increase  this  section  to  six  minutes. 

(6)  Assessment  exercises  and  other  background  and  attitude 
questions  would  be  grouped  into  blocks  which  would  require 
fourteen  minutes  to  complete.    These  blocks  would  contain  an 
average  of  twelve  minutes  of  reading  and  writing  exercises  and 
two  minutes  of  background  and  attitude  questions.    Thus,  each 
student's  48  minutes  would  include  the  common  questions  (six 
minutes)  and  three  blocks  of  other  assessment  questions 
(fourteen  minutes  each).    In  terms  of  content,  a  student  would 
spend  twelve  minutes  on  background  and  attitude  questions  and 
36  minutes  on  reading  or  writing  exercises. 

(7)  Several  longer  writing  exercises  could  not  be  administered  in 
the  twelve  minutes  allocated  in  each  block  and  were 
accommodated  by  creating  three  double-length  blocks  (28 
minutes). 
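
The time budget implied by decisions (1) through (7) can be tallied directly. The sketch below only restates figures given in the list above; the constant names are illustrative.

```python
COMMON_MIN = 6                 # questions common to all students
BLOCK_MIN = 14                 # one single-length block
BLOCK_EXERCISE_MIN = 12        # reading/writing exercises within a block (average)
BLOCK_BACKGROUND_MIN = 2       # background and attitude questions within a block
BLOCKS_PER_STUDENT = 3
STUDENTS_PER_LEVEL = 30_000    # approximate number fundable per grade/age level

session_minutes = COMMON_MIN + BLOCKS_PER_STUDENT * BLOCK_MIN
background_minutes = COMMON_MIN + BLOCKS_PER_STUDENT * BLOCK_BACKGROUND_MIN
exercise_minutes = BLOCKS_PER_STUDENT * BLOCK_EXERCISE_MIN
double_block_minutes = 2 * BLOCK_MIN
total_hours = STUDENTS_PER_LEVEL * session_minutes / 60

print(f"Minutes per student: {session_minutes}")
print(f"Background/attitude minutes: {background_minutes}, exercise minutes: {exercise_minutes}")
print(f"Double-length block: {double_block_minutes} minutes")
print(f"Testing hours per grade/age level: {total_hours:,.0f}")
```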

It  is  immediately  clear  that  a  perfectly  balanced  incomplete  block 
design is impossible, since the double-length blocks cannot be paired
within  the  48-minute  time  limit.    Although  we  could  not  assign  two 
double-length  blocks  to  any  student,  we  could  assign  them  in  such  a  way 
that  we  could  compare  the  double-length  blocks  indirectly  through  one  or  a 
chain  of  single-length  blocks,  and  we  did. 

The  final  sample  consisted  of  three  parts,  one  of  which  received 
BIB-spiralled  booklets,  a  second  received  partially  BIB-spiralled  (UBIB) 
booklets,  and  the  third  was  a  matrix  sample  which  was  assessed  using  paced 
audiotapes.    The  target  sample  sizes  and  the  amount  of  assessment  time  for 
the  different  samples  are  shown  in  Table  5(1). 


5.2    The  Balanced  Incomplete  Block  (BIB)  Spiral  Sample 

The  booklets  in  the  BIB  design  each  contain  the  common  block  and  three 
of  the  nineteen  single-length  blocks  assigned  to  this  sample.    The  nineteen 
blocks  were  assigned  to  booklets  using  a  cyclic  Youden  rectangle  (see 
Beall,  1971).    This  procedure  required  the  formation  and  printing  of  57 
different  booklets  and  assigned  each  individual  block  to  precisely  nine 
different  booklets.    Each  block  is  combined  with  each  other  block  exactly 
once  in  this  design,  and  thus  each  pair  of  exercises  was  assigned  to  some 
sample  of  youths.    (The  block  assignments  are  shown  in  the  left  half  of 
Table  5(2).) 
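
The balance properties claimed for this design can be verified mechanically. The sketch below is an illustration, not the procedure the NAEP staff used: it assumes that the unpermuted design is generated by cyclically shifting three starting booklets, (A, B, G), (A, C, L), and (A, D, H), which is the pattern implied by the unpermuted half of Table 5(2). The names `BASE_BOOKLETS` and `cyclic_design` are hypothetical.

```python
from collections import Counter
from itertools import combinations

BLOCKS = "ABCDEFGHJKLMNOPQRST"   # nineteen block labels; there is no block I
# Starting booklets as index triples: (A, B, G), (A, C, L), (A, D, H)
BASE_BOOKLETS = [(0, 1, 6), (0, 2, 10), (0, 3, 7)]


def cyclic_design(n_blocks=19, base=BASE_BOOKLETS):
    """Develop each starting booklet through all cyclic shifts modulo n_blocks."""
    booklets = []
    for start in base:
        for shift in range(n_blocks):
            booklets.append(tuple((x + shift) % n_blocks for x in start))
    return booklets


booklets = cyclic_design()
labelled = [tuple(BLOCKS[i] for i in booklet) for booklet in booklets]

# How many booklets each block is in, and how often each pair of blocks meets
block_counts = Counter(block for booklet in booklets for block in booklet)
pair_counts = Counter(
    frozenset(pair) for booklet in booklets for pair in combinations(booklet, 2)
)

print(f"Booklets: {len(booklets)}")
print(f"Booklets per block: {sorted(set(block_counts.values()))}")
print(f"Distinct pairs: {len(pair_counts)}, pairings per pair: {sorted(set(pair_counts.values()))}")
```

Relabelling the blocks, renumbering the booklets, and reordering blocks within a booklet, as described next, leave these counts unchanged.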

Block  designations  were  re-coded  using  a  permutation  mapping  of  the 
nineteen  letters  A  through  T  (except  I— there  is  no  block  I).    The  booklet 
numbers  were  then  re-coded  using  a  permutation  mapping  of  the  integers  1 
through  57.    Finally,  the  block  orders  were  randomly  permuted  within  each 




Table 5(1)

Sample Design Summary

                          Blocks                    Students per            Assessment Time in Minutes
                      --------------             ------------------------   --------------------------------
                                                                                     Subject
Sample                Single  Double  Booklets   Booklet   Block   Sample   Common   Matter   Other   Total

BIB
  Age and Grade                                      156   1,400    8,867
  Age Only                                            67     600    3,800
  Grade Only                                          67     600    3,800
  Total                   19       0        57       290   2,600   16,467        6      228      38     272

UBIB
  Age and Grade                                      700   1,400    4,200
  Age Only                                           300     600    1,800
  Grade Only                                         300     600    1,800
  Total                   4*       3         6     1,300   2,600    7,800        6     120*      20    146*

TAPE (Age Only)           12       0         4     1,250   1,250    5,000        6      144      24     174

TOTAL-EACH GRADE/AGE      21       3        67                     29,267        6      324      54     384
TOTAL-ALL GRADE/AGES      63       9       201                     87,801       **

*  Two single blocks are duplicated in BIB sample
** Total assessment time depends on common blocks across grade/age


Table 5(2)

Booklet Design: BIB Spiral Sample
(19 x 3 x 57 Cyclic Youden Rectangle)

             Original Design       Permuted Design
               Item Block            Item Block
Booklet       1    2    3           1    2    3

    1         A    B    G           T    G    L
    2         B    C    H           A    L    P
    3         C    D    J           D    A    T
    4         D    E    K           C    S    E
    5         E    F    L           C    A    H
    6         F    G    M           G    F    H
    7         G    H    N           K    R    N
    8         H    J    O           R    M    F
    9         J    K    P           O    N    L
   10         K    L    Q           F    D    B
   11         L    M    R           E    M    A
   12         M    N    S           S    H    B
   13         N    O    T           M    K    D
   14         O    P    A           T    N    J
   15         P    Q    B           M    T    C
   16         Q    R    C           C    L    Q
   17         R    S    D           H    E    R
   18         S    T    E           C    P    F
   19         T    A    F           L    S    K
   20         A    C    L           N    B    E
   21         B    D    M           N    C    D
   22         C    E    N           Q    K    H
   23         D    F    O           L    H    D
   24         E    G    P           A    S    R
   25         F    H    Q           L    J    R
   26         G    J    R           T    F    Q
   27         H    K    S           C    K    J
   28         J    L    T           O    J    S
   29         K    M    A           Q    O    D
   30         L    N    B           B    Q    J
   31         M    O    C           O    T    H
   32         N    P    D           B    M    L
   33         O    Q    E           C    R    O
   34         P    R    F           G    O    E
   35         Q    S    G           S    Q    M
   36         R    T    H           B    A    O
   37         S    A    J           K    G    A
   38         T    B    K           O    F    K
   39         A    D    H           P    S    T
   40         B    E    J           E    F    L
   41         C    F    K           H    M    J
   42         D    G    L           J    E    D
   43         E    H    M           F    J    A
   44         F    J    N           B    G    C
   45         G    K    O           P    B    K
   46         H    L    P           S    F    N
   47         J    M    Q           P    Q    E
   48         K    N    R           B    R    T
   49         L    O    S           P    M    O
   50         M    P    T           R    P    D
   51         N    Q    A           G    R    Q
   52         O    R    B           S    G    D
   53         P    S    C           H    P    N
   54         Q    T    D           T    E    K
   55         R    A    E           M    G    N
   56         S    B    F           A    N    Q
   57         T    C    G           G    J    P



booklet.    The  final  design  is  shown  in  the  right  half  of  Table  5(2).  The 
booklets in which each block appeared are listed in Table 5(3).

As shown in Table 5(1), this design called for each booklet to be
administered to 288.9 different students and, since each block was in nine
booklets, each block was therefore to be given to about 2,600 students, our
target, at each grade/age combination.  Altogether, this part of the design
called for 288.9 students to take each of the 57 booklets and thus 16,467
students in all.  Looking at the age and grade samples separately, we
expected each booklet to be administered to 222.2 youths at each age or
grade level, and thus each block to be administered to 2,000 youths,
resulting in a total age or grade sample of about 12,667.
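
The targets in this paragraph follow from the per-block target and the structure of the design. The sketch below reproduces the arithmetic; the function name `bib_targets` is illustrative.

```python
N_BOOKLETS = 57
BOOKLETS_PER_BLOCK = 9


def bib_targets(students_per_block):
    """Return (students per booklet, total BIB sample) for a given per-block target."""
    per_booklet = students_per_block / BOOKLETS_PER_BLOCK
    total = per_booklet * N_BOOKLETS
    return per_booklet, total


combined_per_booklet, combined_total = bib_targets(2600)   # each grade/age combination
single_per_booklet, single_total = bib_targets(2000)       # an age or a grade taken alone

print(f"Grade/age combination: {combined_per_booklet:.1f} per booklet, {combined_total:,.0f} in all")
print(f"Age or grade alone:    {single_per_booklet:.1f} per booklet, {single_total:,.0f} in all")
```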


5.3    The  Unbalanced  Incomplete  Block  (UBIB)  Spiral  Sample 

The booklets in the unbalanced design each contain the common block, a
single-length block, and a double-length block.  This design used seven
blocks: three double-length blocks, two "new" single-length blocks that
were not used in the completely balanced design, and two "old" blocks that
were also used in the other design.  This design resulted in the formation
and printing of six booklets.  Two of the double-length blocks were each
combined with one of the new and one of the old blocks; the other
double-length block was paired with both of the new blocks.  The assignment
of blocks to booklets is shown in Table 5(4).

The  design  called  for  each  of  these  booklets  to  be  administered  to 
1,300  youths  and,  since  each  of  the  new  blocks  was  in  exactly  two  booklets, 
each  block  was  also  administered  to  2,600  youths.    Altogether,  the  design 
called  for  7,800  students  to  take  a  UBIB  booklet.    The  design  also  met  the 
objective  of  having  about  2,000  students  take  each  exercise  if  we  observed 
the  sample  for  a  particular  age,  or  2,000  students  if  we  observed  a 
particular  grade. 

The  two  booklets  which  contain  a  double-length  block  and  one  of  the 
single-length  blocks  from  the  completely  balanced  sample  result  in  an 
oversampling  of  these  two  single-length  blocks  since  they  are  already 
adequately sampled in the BIB design.  These two single-length blocks occur
in nine BIB booklets, each of which is administered to about 289 students,
and in one UBIB booklet, which is administered to 1,300 youths; thus, the
targeted sample for each of these blocks was 3,900.
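
The UBIB targets can be tallied from the booklet composition given in Table 5(4). The sketch below uses the Grade 8/Age 13 and Grade 11/Age 17 assignment (with block Y) and takes J and R as the two "old" blocks, as that table shows; the variable names are illustrative.

```python
from collections import Counter

# (long block, short block) for each UBIB booklet, per Table 5(4)
UBIB_BOOKLETS = {
    58: ("U", "J"), 59: ("U", "Y"), 60: ("V", "R"),
    61: ("V", "X"), 62: ("W", "X"), 63: ("W", "Y"),
}
STUDENTS_PER_UBIB_BOOKLET = 1300
BIB_TARGET_PER_BLOCK = 2600   # nine BIB booklets at about 288.9 students each

# Students targeted for each block through the UBIB booklets alone
ubib_students = Counter()
for blocks in UBIB_BOOKLETS.values():
    for block in blocks:
        ubib_students[block] += STUDENTS_PER_UBIB_BOOKLET

total_ubib = STUDENTS_PER_UBIB_BOOKLET * len(UBIB_BOOKLETS)
# The two "old" blocks are also carried by the BIB booklets
old_block_total = {b: BIB_TARGET_PER_BLOCK + ubib_students[b] for b in ("J", "R")}

print(f"UBIB sample: {total_ubib:,}")
print(f"Per-block UBIB targets: {dict(ubib_students)}")
print(f"Combined targets for the old blocks: {old_block_total}")
```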


5.4    Overall  Pairings  of  Item  Blocks  in  the  BIB/UBIB  Design 

The numbers of pairings of item blocks for all BIB and UBIB blocks are
shown in Table 5(5).  Because block Q replaced block Y for Grade 4/Age 9,
the pairings for that grade/age sample are slightly different.




Table 5(3)

Spiral Sample
Block-to-Booklet Correspondence

Block     Booklet Numbers

  A        2    3    5   11   24   36   37   43   56
  B       10   12   20   30   32   36   44   45   48
  C        4    5   15   16   18   21   27   33   44
  D        3   10   13   21   23   29   42   50   52
  E        4   11   17   20   34   40   42   47   54
  F        6    8   10   18   26   38   40   43   46
  G        1    6   34   37   44   51   52   55   57
  H        5    6   12   17   22   23   31   41   53
  J       14   25   27   28   30   41   42   43   57   58
  K        7   13   19   22   27   37   38   45   54
  L        1    2    9   16   19   23   25   32   40
  M        8   11   13   15   32   35   41   49   55
  N        7    9   14   20   21   46   53   55   56
  O        9   28   29   31   33   34   36   38   49
  P        2   18   39   45   47   49   50   53   57
  Q       16   22   26   29   30   35   47   51   56   59*  63*
  R        7    8   17   24   25   33   48   50   51   60
  S        4   12   19   24   28   35   39   46   52
  T        1    3   14   15   26   31   39   48   54
  U       58   59
  V       60   61
  W       62   63
  X       61   62
  Y       59** 63**

*  Grade 4/Age 9 only
** Grade 8/Age 13 and Grade 11/Age 17 only




Table 5(4)

Booklet Design
UBIB Spiral Sample

              Long      Short
Booklet       Block     Block

  58            U         J
  59*           U         Y
  60            V         R
  61            V         X
  62            W         X
  63*           W         Y

* In Grade 4/Age 9, Block Q was substituted for Block Y in Booklets 59 and 63




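Table 5(5)

Number of Pairings of Item Blocks in Spiral Design
(Number of Block Occurrences on the Diagonal)

Grade 4/Age 9

      A  B  C  D  E  F  G  H  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X

A     9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
B        9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
C           9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
D              9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
E                 9  1  1  1  1  1  1  1  1  1  1  1  1  1  1
F                    9  1  1  1  1  1  1  1  1  1  1  1  1  1
G                       9  1  1  1  1  1  1  1  1  1  1  1  1
H                          9  1  1  1  1  1  1  1  1  1  1  1
J                            10  1  1  1  1  1  1  1  1  1  1  1
K                                9  1  1  1  1  1  1  1  1  1
L                                   9  1  1  1  1  1  1  1  1
M                                      9  1  1  1  1  1  1  1
N                                         9  1  1  1  1  1  1
O                                            9  1  1  1  1  1
P                                               9  1  1  1  1
Q                                                 11  1  1  1  1     1
R                                                    10  1  1     1
S                                                        9  1
T                                                           9
U                                                              2
V                                                                 2     1
W                                                                    2  1
X                                                                       2

Grade 8/Age 13 and Grade 11/Age 17

      A  B  C  D  E  F  G  H  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y

A     9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
B        9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
C           9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
D              9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
E                 9  1  1  1  1  1  1  1  1  1  1  1  1  1  1
F                    9  1  1  1  1  1  1  1  1  1  1  1  1  1
G                       9  1  1  1  1  1  1  1  1  1  1  1  1
H                          9  1  1  1  1  1  1  1  1  1  1  1
J                            10  1  1  1  1  1  1  1  1  1  1  1
K                                9  1  1  1  1  1  1  1  1  1
L                                   9  1  1  1  1  1  1  1  1
M                                      9  1  1  1  1  1  1  1
N                                         9  1  1  1  1  1  1
O                                            9  1  1  1  1  1
P                                               9  1  1  1  1
Q                                                  9  1  1  1
R                                                    10  1  1     1
S                                                        9  1
T                                                           9
U                                                              2           1
V                                                                 2     1
W                                                                    2  1  1
X                                                                       2
Y                                                                          2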




5.5  Spiralling 

The  method  for  spiralling  booklets  was  designed  for  two  purposes: 

(1)  To  achieve  a  ratio  of  nine  students  taking  a  UBIB  booklet  to 
two  students  taking  a  BIB  booklet  in  order  to  meet  the 
targeted  sample  sizes  in  each  category;  and, 

(2)  To  distribute  the  booklets  across  the  sample  of  students  so 
that  the  booklets  within  a  category  (BIB  or  UBIB)  would  be 
administered  in  equal  numbers  and  without  positional  bias. 

The  first  purpose  was  accomplished  by  forming  a  cycle  of  168  booklets 
consisting  of  two  sets  of  BIB  booklets  (1-57)  and  nine  sets  of  UBIB 
booklets  (58-63).    The  BIB  and  UBIB  booklets  were  merged  as  follows: 

1  2  58    3    4  59    5    6  60    7    8  61    9  10  62  11 

12  63  13  14  58  15  16  59  17  18  19  60  20  21  61  22  23  62  24 

25  63  26  27  58  28  29  59  30  31  60  32  33  61  34  35  62  36  37 

38  63  39  40  58  41  42  59  43  44  60  45  46  61  47  48  62  49  50 

63  51  52  58  53  54  59  55  56  57  60    1    2  61    3    4  62    5  6 

63    7    8  58  9  10  59  11  12  60  13  14  61  15  16  62  17  18  19 

63  20  21  58  22  23  59  24  25  60  26  27  61  28  29  62  30  31  63 

32  33  58  34  35  59  36  37  38  60  39  40  61  41  42  62  43  44  63 

45  46  58  47  48  59  49  50  60  51  52  61  53  54  62  55  56  57  63 

A  given  BIB  booklet,  say  #1,  appears  two  times  in  this  cycle;  a  given 
UBIB  booklet,  say  #58,  appears  nine  times.    Administering  this  cycle  of 
booklets  evenly  across  the  sample  of  students  establishes  the  ratio  of  nine 
UBIB  booklets  to  two  BIB  booklets. 

In  a  complete  cycle  of  168  booklets,  each  of  the  six  UBIB  booklets  will 
have  appeared  nine  times  and  each  of  the  57  BIB  booklets  will  have  appeared 
two  times.    As  a  result  of  this  spiralling,  each  of  the  24  blocks  of  items 
used  in  BIB  and  UBIB  booklets  will  appear  the  same  number  of  times  in  a 
complete  cycle  (except  for  blocks  J  and  R,  which  are  used  in  both  BIB  and 
UBIB  booklets  at  all  three  grade/age  levels,  and  block  Q,  which  was  used  in 
place  of  block  Y  for  the  Grade  4/Age  9  UBIB  booklets). 

Each  block,  except  for  blocks  J,  R,  and  Q,  appears  exactly  eighteen 
times  in  the  168-booklet  cycle.    Blocks  J  and  R  appear  27  times.    Block  Q 
appears 36 times in the Grade 4/Age 9 spiralling cycle.  Block Y appears
zero  times  in  the  Grade  4/Age  9  spiralling  cycle. 
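
The booklet and block counts stated above can be checked against the cycle as listed. In the sketch below the cycle is copied from the listing in this section; the BIB booklets are represented by the unpermuted cyclic design (an assumption made for brevity, since relabelling changes which booklets carry a block but not how many), and the UBIB composition is taken from Table 5(4). All names are illustrative.

```python
from collections import Counter

# The 168-booklet cycle, copied from the listing above
CYCLE = [
    1, 2, 58, 3, 4, 59, 5, 6, 60, 7, 8, 61, 9, 10, 62, 11,
    12, 63, 13, 14, 58, 15, 16, 59, 17, 18, 19, 60, 20, 21, 61, 22, 23, 62, 24,
    25, 63, 26, 27, 58, 28, 29, 59, 30, 31, 60, 32, 33, 61, 34, 35, 62, 36, 37,
    38, 63, 39, 40, 58, 41, 42, 59, 43, 44, 60, 45, 46, 61, 47, 48, 62, 49, 50,
    63, 51, 52, 58, 53, 54, 59, 55, 56, 57, 60, 1, 2, 61, 3, 4, 62, 5, 6,
    63, 7, 8, 58, 9, 10, 59, 11, 12, 60, 13, 14, 61, 15, 16, 62, 17, 18, 19,
    63, 20, 21, 58, 22, 23, 59, 24, 25, 60, 26, 27, 61, 28, 29, 62, 30, 31, 63,
    32, 33, 58, 34, 35, 59, 36, 37, 38, 60, 39, 40, 61, 41, 42, 62, 43, 44, 63,
    45, 46, 58, 47, 48, 59, 49, 50, 60, 51, 52, 61, 53, 54, 62, 55, 56, 57, 63,
]

BLOCKS = "ABCDEFGHJKLMNOPQRST"
BASE = [(0, 1, 6), (0, 2, 10), (0, 3, 7)]
# Unpermuted BIB booklets 1-57 (assumed stand-in for the permuted design)
BIB = {19 * j + shift + 1: [BLOCKS[(x + shift) % 19] for x in base]
       for j, base in enumerate(BASE) for shift in range(19)}
# UBIB booklets for Grade 8/Age 13 and Grade 11/Age 17, per Table 5(4)
UBIB = {58: "UJ", 59: "UY", 60: "VR", 61: "VX", 62: "WX", 63: "WY"}


def block_counts(ubib):
    """Count how often each block occurs in one pass through the cycle."""
    counts = Counter()
    for booklet in CYCLE:
        counts.update(BIB[booklet] if booklet <= 57 else ubib[booklet])
    return counts


booklet_counts = Counter(CYCLE)
older = block_counts(UBIB)
# Grade 4/Age 9: block Q replaces block Y in booklets 59 and 63
grade4 = block_counts({k: v.replace("Y", "Q") for k, v in UBIB.items()})

print(f"Cycle length: {len(CYCLE)}")
print(f"Older levels: A={older['A']}, J={older['J']}, R={older['R']}, Y={older['Y']}")
print(f"Grade 4/Age 9: Q={grade4['Q']}, Y={grade4['Y']}")
```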

The  second  purpose  was  accomplished  by  collecting  this  cycle  of  168 
booklets  into  bundles  of  23  consecutive  booklets,  with  a  subsequent  bundle 
beginning  where  the  previous  bundle  left  off;  the  last  of  the  168  booklets 
was  always  followed  by  the  first  in  a  continuous  circling  process  (hence 
the  term  "spiralling").    As  a  result,  168  different  bundles  were  created 
and  each  booklet  distributed  evenly  throughout  23  positions  in  the  bundles. 
By  shipping  consecutive  bundles  to  schools,  the  likelihood  that  any  given 
booklet  would  be  used  was  equalized  across  the  sample. 
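
The bundling step works because 23 and 168 have no common factor, so successive bundles start at every point of the cycle before repeating. The sketch below illustrates this with 0-based cycle slots rather than booklet numbers; the function name `bundle` is hypothetical.

```python
from math import gcd

CYCLE_LENGTH = 168
BUNDLE_SIZE = 23


def bundle(k):
    """Cycle slots (0-based) contained in the k-th consecutive bundle."""
    start = (k * BUNDLE_SIZE) % CYCLE_LENGTH
    return [(start + i) % CYCLE_LENGTH for i in range(BUNDLE_SIZE)]


bundles = [bundle(k) for k in range(CYCLE_LENGTH)]
distinct_starts = {b[0] for b in bundles}

# For every slot of the cycle, record the within-bundle positions it occupies
positions = {slot: [] for slot in range(CYCLE_LENGTH)}
for b in bundles:
    for position, slot in enumerate(b):
        positions[slot].append(position)

print(f"gcd(23, 168) = {gcd(BUNDLE_SIZE, CYCLE_LENGTH)}")
print(f"Distinct bundles before the pattern repeats: {len(distinct_starts)}")
print(f"Each slot occupies every position once: "
      f"{all(sorted(p) == list(range(BUNDLE_SIZE)) for p in positions.values())}")
```

Because each BIB booklet fills two slots of the cycle and each UBIB booklet nine, a booklet occupies each of the 23 positions the same number of times as the other booklets in its category.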




5.6    The  Tape  Sample 


Four  assessment  booklets  were  designed  for  the  tape  sample.    Each  was 
to be administered to a subsample of 1,250.  Each booklet contained the
six-minute common block and 42 minutes of cognitive exercises and background
and  attitude  items.    Since  a  tape  recorder  was  used  in  administration,  all 
students  in  an  assessment  session  were  assigned  the  same  booklet. 


5.7    Achieved  Samples 

The  results  of  the  implementation  of  the  entire  design  are  shown  in 
Tables  5(6)  and  5(7)  and  Figures  5-1  and  5-2.    Table  5(6)  presents  the 
number  of  students  assessed  by  each  booklet  by  grade/age.    The  same 
information  is  graphically  depicted  in  Figure  5-1. 

The  number  of  students  responding  to  each  BIB  and  UBIB  block  appears  in 
Table  5(7)  and  is  graphically  depicted  in  Figure  5-2. 


5.8    Advantages  and  Disadvantages  of  the  Spiral  Design 

A large, complex assessment design such as that used in the Year 15
NAEP  has  a  number  of  advantages  and  disadvantages,  which  should  be 
mentioned. 

5.8.1    Interrelationships  Among  Exercises 

The  purpose  of  the  BIB  spiral  design  was  to  allow  the  examination  of 
the  interrelationships  of  a  large  number  of  exercises,  and  it  does.  The 
final  sample  includes  nineteen  14-minute  blocks,  266  minutes  in  all,  of 
exercises,  and  for  any  pair  of  exercises  in  these  blocks  there  is  a  sample 
of youths who were presented both exercises.  Thus, correlations can be
computed  among  all  the  exercises  in  this  part  of  the  sample.    The  remaining 
112  minutes  of  exercises  are  organized  so  that  some,  but  not  all,  of  the 
correlations  can  be  calculated. 

This  design  is  in  contrast  to  the  multiple  matrix  design  which  was  used 
previously.    Given  a  fixed  sample  size,  simple  matrix  sampling  and  BIB 
spiralling  would  administer  any  particular  exercise  to  the  same  number  of 
youths,  but,  by  creating  more  booklets,  the  spiral  design  would  pair  the 
exercises  in  a  block  with  many  different  blocks  of  exercises,  thus 
increasing  the  number  of  comparisons  that  could  be  made.  Consequently, 
many correlations are possible, most of them based on a fairly small,
though  well-selected,  sample.    In  the  design  as  implemented,  correlations 
within  a  block  are  based  on  about  2,000  students  for  an  age  or  grade 
separately;  correlations  between  blocks  are  based  on  about  222  students  for 
an age or grade separately.  In contrast, the simple matrix sampling used




Table 5(6)

Number of Booklets Administered
Spiral and Tape Samples

Booklet        Grade 4/       Grade 8/       Grade 11/
Number          Age 9          Age 13         Age 17

    1        [illegible]    [illegible]    [illegible]
    2        [illegible]    [illegible]    [illegible]
    3        [illegible]    [illegible]    [illegible]
    4        [illegible]    [illegible]    [illegible]
    5        [illegible]    [illegible]    [illegible]
    6        [illegible]    [illegible]    [illegible]
    7        [illegible]    [illegible]        335
    8        [illegible]    [illegible]    [illegible]
    9        [illegible]    [illegible]    [illegible]
   10        [illegible]    [illegible]    [illegible]
   11        [illegible]    [illegible]    [illegible]
   12        [illegible]    [illegible]    [illegible]
   13        [illegible]    [illegible]    [illegible]
   14        [illegible]    [illegible]    [illegible]
   15        [illegible]    [illegible]        333
   16        [illegible]    [illegible]    [illegible]
   17        [illegible]    [illegible]    [illegible]
   18        [illegible]    [illegible]    [illegible]
   19        [illegible]    [illegible]    [illegible]
   20            302        [illegible]    [illegible]
   21            312        [illegible]    [illegible]
   22        [illegible]    [illegible]    [illegible]
   23            305        [illegible]    [illegible]
   24            307            332            338
   25            317            328            336
   26            312            325            344
   27            315            327            350
   28            329            322            347
   29            319            328            349
   30            317            314            350
   31            307            324            338
   32            316            332            345
   33            306            331            344
   34            296            328            344
   35            302            335            340
   36            311            336            344
   37        [illegible]    [illegible]    [illegible]
   38        [illegible]    [illegible]    [illegible]
   39        [illegible]    [illegible]    [illegible]
   40        [illegible]    [illegible]    [illegible]
   41        [illegible]    [illegible]    [illegible]
   42        [illegible]    [illegible]    [illegible]
   43        [illegible]        344            353
   44        [illegible]    [illegible]    [illegible]
   45        [illegible]    [illegible]    [illegible]
   46        [illegible]        344            343
   47        [illegible]        339            338
   48        [illegible]    [illegible]    [illegible]
   49            317            325            348
   50            309            338            351
   51            315            323            349
   52            322            338            350
   53            316            349            356
   54            312            350            344
   55            316            338            347
   56            313            349            329
   57            317            339            346
   58           1422           1516           1572
   59           1396           1527           1537
   60           1416           1529           1574
   61           1395           1513           1528
   62           1385           1520           1528
   63           1405           1502           1543

Tape Booklets

   64           1403           1310           1539
   65           1356           1276           1540
   66           1389           1283           1596
   67           1344           1289           1534

Total          31579          33563          35070
Total Spiral   26087          28405          28861
Total Tape      5492           5158           6209



Figure 5-1

BIB Spiral Sample
Number of Students per Block

[Figure not reproduced.  Panel title: Age 9 / Grade 4 (Total N = 26,087)]


Table 5(7)

Number of Blocks Administered:
Spiral and Tape Samples

              Grade 4/Age 9    Grade 8/Age 13    Grade 11/Age 17
Block             Total             Total             Total

Spiral Sample

  A                2771              3075              3098
  B                2795              3042              3093
  C                2776              3052              3084
  D                2790              3053              3124
  E                2741              3069              3089
  F                2744              3072              3082
  G             [illegible]          3030              3122
  H                2795              3046              3100
  J¹               4213           [illegible]       [illegible]
  K                2800              3089              3080
  L                2778              3075           [illegible]
  M                2792              3057              3045
  N                2810              3076              3066
  O                2806              2974              3095
  P                2803              3075              3120
  Q²               5611              2982              3083
  R³               4211              4524              4641
  S                2815              3063              3084
  T                2788              3060              3076
  U                2818              3043              3109
  V                2811              3042              3102
  W                2790              3022              3071
  X                2780              3033              3056
  Y⁴                                 3029              3080

  Total Spiral    26087             28405             28861

Tape Sample

  P64              1403              1310              1539
  P65              1356              1276              1540
  P66              1389              1283              1596
  P67              1344              1289              1534

¹Block J appeared in both BIB and UBIB booklets
²Block Q was substituted for Block Y in books 59 and 63 for Grade 4/Age 9
³Block R appeared in both BIB and UBIB booklets
⁴Block Y was not administered at Grade 4/Age 9


Figure 5-2

BIB Spiral, UBIB Spiral and Tape Samples
Number of Students per Booklet

[Figure not reproduced.  Three panels, each plotted against booklet number:
Age 9 / Grade 4 (Total N = 31,579); Age 13 / Grade 8 (Total N = 33,563);
Age 17 / Grade 11 (Total N = 35,070)]


previously allowed the larger number of exercises within a package to be
correlated  (based  on  around  2,500  students)  but  did  not  allow  any 
calculation  of  correlations  among  exercises  in  different  packages. 
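
The practical consequence of the differing sample sizes can be illustrated with the textbook large-sample standard error of a Fisher z-transformed correlation, 1/sqrt(n - 3). This formula is not given in the chapter and assumes simple random sampling, so it ignores the clustering and weighting of the NAEP sample; it is offered only as a rough sketch of relative precision.

```python
from math import sqrt


def fisher_z_standard_error(n):
    """Approximate standard error of a z-transformed correlation (simple random sampling assumed)."""
    return 1.0 / sqrt(n - 3)


within_block = fisher_z_standard_error(2000)    # exercises in the same block
between_block = fisher_z_standard_error(222)    # exercises in different blocks
within_package = fisher_z_standard_error(2500)  # earlier package design

print(f"Within block (n = 2,000):    {within_block:.4f}")
print(f"Between blocks (n = 222):    {between_block:.4f}")
print(f"Within package (n = 2,500):  {within_package:.4f}")
```

Under these assumptions a between-block correlation is roughly three times less precise than a within-block correlation, which is the price paid for being able to estimate it at all.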


5.8.2    The Cost of Complexity

Clearly,  spiralling  is  expensive  in  printing  costs  as  well  as  in  the 
costs  of  design  talent  and  detail  management.    Including  the  multiple 
matrix  sampling  that  was  done  for  this  NAEP,  67  booklets  were  created  for 
each  of  the  three  grade/ages  assessed;  thus,  there  were  201  booklets 
created  in  all.    It  was  expensive  to  produce  many  booklets  in  small 
volumes.    It  was  tedious  to  manage  a  task  in  which  every  detail  had  to  be 
multiply  checked.    Another  substantial  cost  was  incurred  by  the  creation  of 
an  intelligent  data  entry  system,  since  developing  a  way  to  read  the 
booklets  by  machine  was  impossible,  given  available  time  and  resources. 

Spiralling did, however, reduce costs in some ways.  The system was
robust  against  failures  in  the  field,  since  a  serious  biasing  of  results  by 
having  the  exercise  administrators  use  the  wrong  bundle  of  booklets  was 
most unlikely and would have very little effect on the design.  The absence
of  the  tape  recorder  reduced  costs  in  both  preparation  and  administration 
of  the  assessment.    Most  importantly,  as  noted  in  Section  5.8.4,  the  spiral 
design  reduced  the  number  of  students  needed  to  achieve  a  fixed  standard 
error,  thus  allowing  us  to  assess  more  exercises. 


5.8.3    Tape-Recorded  Administration 

Losing  the  ability  to  administer  assessments  by  tape  recorder  was  not 
something  that  the  NAEP  staff  wanted,  but  came  about  because  of  the  spiral 
design.    It  is  clear  that,  when  each  student  in  an  assessment  session  is 
taking  a  different  booklet,  the  administration  cannot  be  presented  with  a 
single  tape  recorder.      We  did  not  consider  tape  recorders  with  headphones 
for  use  by  individual  students. 

The  advantage  of  tape-recorded  administration  is  that  it  allows  the 
separation  of  reading  ability  from  the  subject  area  being  assessed.    In  a 
reading  assessment,  the  instructions  are  tape  recorded  and  the  progress 
through  the  assessment  is  paced,  although  the  reading  exercises  themselves 
are,  of  course,  not  read.    In  other  subject  areas,  the  exercises  are  read 
aloud  so  that  students  can  respond  to  an  exercise  even  though  they  may  not 
be  able  to  read  it.    This  is  clearly  a  desirable  feature.  Additionally, 
the  pacing  feature  of  tape-recorded  administration  tends  to  ensure  that 
each  student  is  exposed  to  each  exercise.    This  is  also  a  desirable 
feature. 

And  yet,  the  utility  of  the  NAEP  is  greatly  enhanced  by  developing 
exercises  that  teachers  or  local  or  state  personnel  can  readily  administer 
to  their  students,  the  results  of  which  can  then  be  compared  to  the  NAEP 
sample.    Teachers  are  not  likely  to  simulate  the  tape  recording;  thus,  any 
comparisons would be suspect. We know of no local or state assessments
that currently use tape-recorded administrations (although some states and
districts  in  the  early  days  of  NAEP  replicated  all  administration 
procedures including taped administrations). Thus, the tape recorder had
the effect of setting the NAEP results apart from all other student
assessments.


5.8.4    Sampling Efficiency

One advantage of the spiral design is that it presents a particular
block of exercises to fewer persons in a school, but to more schools. In
this way, the cluster effect is markedly reduced; thus, the students are
used more efficiently. Given reasonable assumptions, it has been estimated
that the required sample size to achieve a given standard error is reduced
by about 20 to 25 percent by BIB spiralling, as compared to multiple matrix
sampling; alternatively, the standard errors could be reduced by about 10
to 15 percent if the sample size were kept constant (Hansen, Tepping, Lago,
& Burke, 1984). Analyses of the design effects from the Year 15 NAEP
(discussed in Chapter 14.2) show that this reduction in variability has,
indeed, taken place.
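
The two ranges quoted above agree with each other under the common approximation that a standard error is proportional to the square root of (design effect / sample size). The sketch below is not taken from Hansen et al.; it assumes only that proportionality, and the function name is chosen here for illustration. It converts a fractional reduction in required sample size into the equivalent reduction in standard error when the sample size is held fixed.

```python
import math


def se_reduction_at_fixed_n(sample_size_reduction):
    """Fractional drop in standard error at fixed n implied by a drop in required n.

    Assumption (not from the report): SE is proportional to sqrt(deff / n).
    If the same SE can be reached with (1 - r) * n students, the design
    effect has fallen by the factor (1 - r), so at the original n the SE
    falls by the factor sqrt(1 - r).
    """
    if not 0.0 <= sample_size_reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 - math.sqrt(1.0 - sample_size_reduction)


for r in (0.20, 0.25):
    print(f"{r:.0%} fewer students -> {se_reduction_at_fixed_n(r):.1%} smaller standard error")
```

Under this assumption, reductions of 20 and 25 percent in sample size correspond to standard errors roughly 11 and 13 percent smaller, which falls inside the "about 10 to 15 percent" range given in the text.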


5.8.5    Statistical  Issues 

As  mentioned  above,  spiralling  does  not  result  in  a  complete, 
rectangular  data  matrix  that  can  be  analyzed  using  standard  statistical 
systems  nor  does  it  generate  data  which  are  consistent  with  normal 
statistical  methods.    The  techniques  used  to  analyze  such  a  dataset  are 
discussed  in  subsequent  chapters. 

The  exercise  assignment  procedures  produced  a  total  of  67  different 
samples  of  the  population  of  students  of  a  particular  grade/age,  one  for 
each  of  the  63  BIB/UBIB  spiral  booklets  and  one  for  each  of  the  four  tape 
booklets.    Although  each  of  these  samples  involved  different  students,  they 
are,  in  a  particular  sense,  equivalent  to  each  other.    Because  they  were 
selected  by  probability  sampling  techniques  (described  in  Chapter  4),  the 
complete  set  of  students  of  a  given  grade/age  who  were  selected  for 
assessment  are  a  representative  probability  sample  of  the  population  of 
students  of  that  grade/age  designation.    The  procedure  for  designating 
whether  a  given  student  was  to  be  assessed  in  a  spiral  session  or  in  a  tape 
session  and,  if  a  tape  session,  which  of  the  four  booklets  was  to  be  used, 
was  also  done  in  a  (controlled)  random  manner;  the  procedure  (given  in 
Chapter  4)  ensured  that  every  student  could  have  been  selected  for  any  one 
of  the  four  tape  sessions  or  for  a  spiral  session.    This  random  assignment 
was  controlled  (by  systematic  selection)  to  ensure  that  each  of  the  five 
samples  (the  four  tape  samples  and  the  combined  spiral  sample)  was 
representative  of  the  population,  in  particular  controlling  for  all  of  the 
stratification  variables  (region,  size  and  description  of  community)  as 
well  as  the  size  of  the  school. 

The  (larger)  spiral  sample  was  further  divided  into  63  subsamples  by 
the BIB/UBIB spiral technique described previously. As in the case of the
assignment of type of session to student, this division was also done in a
systematic (but random) manner, to ensure that every student who was
selected for a spiral session could have received any one of the 63
booklets. This random assignment was done within sessions within each
school and so is more likely to result in samples of students which closely
match each other in terms of their demographic characteristics.
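
The within-session division can be pictured with a small sketch. It is an illustration only, not the procedure documented in Chapters 4 and 5: it assumes the 63 booklets are handed out in a fixed cyclic order from a randomly drawn starting booklet, and the function and variable names are hypothetical.

```python
import random

NUM_SPIRAL_BOOKLETS = 63  # spiral booklets 1-63 for each grade/age, per the text


def spiral_assign(student_ids, start=None, rng=random):
    """Hand out spiral booklets to the students of one session in cyclic order.

    `start` is the booklet given to the first student. When omitted it is
    drawn at random (an assumption of this sketch), so that any student
    could have received any one of the 63 booklets.
    """
    if start is None:
        start = rng.randint(1, NUM_SPIRAL_BOOKLETS)
    return {
        student: (start - 1 + offset) % NUM_SPIRAL_BOOKLETS + 1
        for offset, student in enumerate(student_ids)
    }


session = [f"student_{i:02d}" for i in range(1, 6)]
print(spiral_assign(session, start=62))
```

With a start of 62, the five students receive booklets 62, 63, 1, 2 and 3. Neighbouring students in a session get different booklets, so each booklet's subsample is spread across many sessions and schools.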

The 63 samples corresponding to the spiral booklets and the four
samples corresponding to the tape booklets given at a particular grade/age
are each representative samples of their target population of all students
in the grade/age. Since any assessed student could have been placed in any
one of these samples, and because of the balance that is enforced by the
method of sampling, each of these samples can be deemed equivalent, in a
sense, to each other. We will call them randomly equivalent. Because of
the closer match between the various samples that is possible with
spiralling, the equivalence between the spiral samples is closer than is
the equivalence between the tape samples.




Chapter  6 
INSTRUMENT AND ITEM INFORMATION^


Janet  R.  Johnson 
Educational  Testing  Service 


The  Year  15  assessment  incorporated  four  distinct  types  of  instruments: 
student  assessment  booklets,  a  questionnaire  for  excluded  students,  a 
teacher  questionnaire,  and  a  school  characteristics  and  policy 
questionnaire.    The  data  collected  from  these  instruments  are  available  on 
the  public-use  data  tapes.    This  chapter  begins  with  a  discussion  of  how 
cognitive  and  non-cognitive  items  were  organized  into  blocks  to  create  the 
student  assessment  booklets.    Sections  6.1.2  through  6.1.4  provide  an 
overview  of  the  items.    The  last  three  sections  describe  the 
questionnaires. 


6.1    Student  Assessment  Instruments 

Student  assessment  booklets  were  composed  of  items  that  were  either 
cognitive  or  non-cognitive.    Cognitive  items  were  reading  exercises,  study 
skill  exercises  or  writing  exercises.    Non-cognitive  items  asked  questions 
relative  to  the  background  and  attitudes  of  students.    Some  non-cognitive 
items  were  presented  to  every  student  and  were  placed  together  in  a  block 
called  the  common  block  or  common  core.    Others  were  placed  at  the 
beginning  of  the  blocks  containing  the  cognitive  items.     Later  sections  of 
this  chapter  provide  greater  detail  about  both  the  cognitive  and 
non-cognitive  items. 

Based upon the Balanced Incomplete Block (BIB) and Unbalanced
Incomplete Block (UBIB) sampling design (described in Chapter 5), cognitive
and non-cognitive items were grouped into blocks. Twenty-four blocks of
items were used to create a total of 63 spiral assessment booklets and four
tape-administered booklets for each grade/age. Tables 6(1) and 6(2) show
the blocks contained in each booklet used for each grade/age.
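
The booklet tables list the blocks in each booklet, while the later item tables list the booklets containing each block; one view can be derived from the other. The sketch below shows that inversion for a handful of rows copied from Table 6(1). The helper names are hypothetical, and the parsing rule (an asterisk marks the preceding block as double-length) follows the table's own footnote.

```python
from collections import defaultdict

# A few rows copied from Table 6(1); the full table has 67 booklets.
BOOKLETS = {1: "TGL", 2: "ALP", 3: "DAT", 4: "CSE", 5: "CAH", 58: "U*J"}


def parse_blocks(code):
    """Split a booklet code into (block, is_double_length) pairs."""
    blocks = []
    for ch in code:
        if ch == "*":
            # The asterisk flags the block just before it as double-length.
            name, _ = blocks[-1]
            blocks[-1] = (name, True)
        else:
            blocks.append((ch, False))
    return blocks


def booklets_by_block(booklets):
    """Invert booklet -> blocks into block -> sorted list of booklet numbers."""
    index = defaultdict(list)
    for number in sorted(booklets):
        for name, _ in parse_blocks(booklets[number]):
            index[name].append(number)
    return dict(index)


index = booklets_by_block(BOOKLETS)
print(index["A"])            # booklets in this subset that contain block A
print(parse_blocks("U*J"))   # double-length block U followed by single block J
```

Run on the complete table, the same inversion would give the "Booklets Containing Block" column of the item tables.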


Some  of  the  tables  for  this  chapter  were  generated  by  David  Freund  and 
Alfred  Rogers;  details  regarding  block  assemblage  were  provided  by  Kalle 
Gerritz;  and  the  taxonomy  provided  in  Table  A(2)  of  Appendix  A  was  created 
by  Gita  Valuer. 

^Tape-administered  booklets  were  used  in  group  administrations  to 
"pace"  students  through  booklets  with  audio  recordings.    The  instructions 
were  read  by  an  announcer;  reading  passages,  items,  and  response  choices 




Table 6(1)

Booklet Contents by Block for Grade 4/Age 9


Booklet  Block     Booklet  Block     Booklet  Block

   1     TGL         26     TFQ         51     GRQ
   2     ALP         27     CKJ         52     SGD
   3     DAT         28     OJS         53     HPN
   4     CSE         29     QOD         54     TEK
   5     CAH         30     BQJ         55     MGN

   6     GFH         31     OTH         56     ANQ
   7     KRN         32     BML         57     GJP
   8     RMF         33     CRO         58     U*J
   9     ONL         34     GOE         59     U*Q
  10     FDB         35     SQM         60     V*R

  11     EMA         36     BAO         61     V*X
  12     SHB         37     KGA         62     W*X
  13     MKD         38     OFK         63     W*Q
  14     TNJ         39     PST         64     tape
  15     MTC         40     EFL         65     tape

  16     CLQ         41     HMJ         66     tape
  17     HER         42     JED         67     tape
  18     CPF         43     FJA
  19     LSK         44     BGC
  20     NBE         45     PBK

  21     NCD         46     SFN
  22     QKH         47     PQE
  23     LHD         48     BRT
  24     ASR         49     PMO
  25     LJR         50     RPD


*double-length block




Table 6(2)

Booklet Contents by Block for Grade 8/Age 13 and Grade 11/Age 17


Booklet  Block     Booklet  Block     Booklet  Block

   1     TGL         26     TFQ         51     GRQ
   2     ALP         27     CKJ         52     SGD
   3     DAT         28     OJS         53     HPN
   4     CSE         29     QOD         54     TEK
   5     CAH         30     BQJ         55     MGN

   6     GFH         31     OTH         56     ANQ
   7     KRN         32     BML         57     GJP
   8     RMF         33     CRO         58     U*J
   9     ONL         34     GOE         59     U*Y
  10     FDB         35     SQM         60     V*R

  11     EMA         36     BAO         61     V*X
  12     SHB         37     KGA         62     W*X
  13     MKD         38     OFK         63     W*Y
  14     TNJ         39     PST         64     tape
  15     MTC         40     EFL         65     tape

  16     CLQ         41     HMJ         66     tape
  17     HER         42     JED         67     tape
  18     CPF         43     FJA
  19     LSK         44     BGC
  20     NBE         45     PBK

  21     NCD         46     SFN
  22     QKH         47     PQE
  23     LHD         48     BRT
  24     ASR         49     PMO
  25     LJR         50     RPD

*double-length block


For  Grade  4/Age  9,  20  single-  and  3  double-length  blocks  were  used  to 
create  booklets.    For  Grade  8/Age  13  and  Grade  11/Age  17,  21  single-  and  3 
double-length  blocks  were  used  to  create  booklets.    Tables  6(3),  6(4),  and 
6(5)  show  the  contents  of  the  blocks  and  the  booklets  in  which  they  were 
placed.    Each  single-length  block  contained  fourteen  minutes  of  assessment 
items.  Approximately,  the  first  two  minutes  were  devoted  to  background  and 
attitude  items  while  the  remainder  of  the  fourteen  minutes  contained 
cognitive  items.    The  double-length  blocks  were  similarly  arranged  but 
allowed  28  minutes  total  assessment  time,  the  majority  of  which  was  to  be 
devoted by the student to responding to the cognitive items. It is
important  to  remember  that  while  the  content  of  some  blocks  was  identical 
for  more  than  one  grade/age,  and  sometimes  identical  for  all  three 
grade/ages,  this  was  not  true  in  every  instance.    For  example,  the 
cognitive  items  contained  in  Block  X  for  Grade  4/Age  9  are  entirely 
different  from  those  contained  in  Block  X  for  Grade  8/Age  13  and  Grade 
11/Age 17. As illustrated by the tables, and described below, different
blocks  contained  different  types  of  items. 
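
The block timings imply a common length for every spiral booklet. The short calculation below is a sketch under stated assumptions: it adds the six-minute common block described in Section 6.1.4 to the block times given above, and ignores any time spent on instructions or distribution. The function name is hypothetical, and booklet codes are written as in Tables 6(1) and 6(2).

```python
SINGLE_BLOCK_MINUTES = 14
DOUBLE_BLOCK_MINUTES = 28
COMMON_BLOCK_MINUTES = 6  # background and attitude items common to all students


def booklet_minutes(code):
    """Total assessment minutes for a booklet code such as "TGL" or "U*J".

    An asterisk after a block letter marks that block as double-length.
    """
    minutes = COMMON_BLOCK_MINUTES
    for i, ch in enumerate(code):
        if ch == "*":
            continue
        is_double = code[i + 1 : i + 2] == "*"
        minutes += DOUBLE_BLOCK_MINUTES if is_double else SINGLE_BLOCK_MINUTES
    return minutes


for code in ("TGL", "U*J"):
    print(code, booklet_minutes(code))
```

Both layouts, three single-length blocks or one double-length block plus one single-length block, come to 48 minutes under these assumptions.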

Blocks  A  through  G  were  writing  blocks,  which  contained  writing-related 
non-cognitive  items  followed  by  writing  exercises.    Blocks  H  through  R  were 
reading  blocks,  which  contained  both  general  and  reading-related 
non-cognitive  items  followed  by  reading  exercises.    The  number  of  reading 
or  writing  exercises  within  a  block,  listed  in  Tables  6(3)  through  6(5) 
under  the  heading  "Cog.  Items",  varied  from  one  block  to  another. 

Some  items  that  had  been  considered  reading  items  in  the  broader 
definition  used  by  the  learning  area  committees  of  earlier  assessments  were 
re-classified  as  "study  skill"  items  for  Year  15.    An  item  was  classified 
as  a  study  skill  item  if  it  required  some  specially  learned  skill  above  and 
beyond  the  facility  of  recognizing  and  understanding  the  printed  word.  For 
example,  these  items  included  those  whose  stimulus  was  a  bar  graph,  a 
telephone  bill  or  a  table  of  contents.    Study  skill  items  were  concentrated 
in blocks S and T; some study skill items also appeared in the four tape
booklets. They were excluded from the group of items used in the IRT
analysis  (see  Chapter  10.3)  because  they  were  believed  to  be  representative 
of  a  different  dimension. 

Blocks  U,  V,  and  W  contained  a  combination  of  writing  and  reading  items 
and  were  28  minutes  long.    Block  X  was  fourteen  minutes  long  and  contained 
both  reading  and  writing  items.    Block  Y,  which  was  not  administered  to 
Grade  4/Age  9,  was  fourteen  minutes  long  and  contained  reading  items. 
Blocks  X  and  Y  were  used  exclusively  in  combination  with  the  28-minute 
blocks. 

Tables  6(3)  through  6(5)  also  provide  the  total  number  of  each  type  of 
item — background,  writing  or  reading — for  each  age.    As  can  be  seen,  the 
item  pool  varied  in  number  of  items  from  one  grade/age  to  another. 


were read by the student. The taped administrations were used in previous
NAEP assessments and were used again in Year 15 to explore the effects of
the change from audiotaped recordings to pencil-and-paper instruments.




Table 6(3)
Assessment Items for Grade 4/Age 9


                                                 No.    No.
                           Writing  Reading      Total  Cog.
Block   Type    Bg. Items  Items    Items        Items  Items Booklets Containing Block

Common  Bg.     1-37                             37           1-63
A       Wr.     1-12       13                    13     1     2 3 5 11 24 36 37 43 56
B       Wr.     1-15       16                    16     1     10 12 20 30 32 36 44 45 48
C       Wr.     1-22       23                    23     1     4 5 15 16 18 21 27 33 44
D       Wr.     1-24       25                    25     1     3 10 13 21 23 29 42 50 52
E       Wr.     1-9        10,11                 11     2     4 11 17 20 34 40 42 47 54
F       Wr.     1-5        6,7                   7      2     6 8 10 18 26 38 40 43 46
G       Wr.     1-6        7,8                   8      2     1 6 34 37 44 51 52 55 57
H       Rdg.    1-4                 5-15         15     11    5 6 12 17 22 23 31 41 53
J       Rdg.    1-11                12-24        24     13    14 25 27 28 30 41 42 43 57 58
K       Rdg.    1-8                 9-19         19     11    7 13 19 22 27 37 38 45 54
L       Rdg.    1-19                20-26        26     7     1 2 9 16 19 23 25 32 40
M       Rdg.    1-4                 5-16         16     12    8 11 13 15 32 35 41 49 55
N       Rdg.    1-11                12-25        25     14    7 9 14 20 21 46 53 55 56
O       Rdg.    1-11                12-22        22     11    9 28 29 31 33 34 36 38 49
P       Rdg.    1-6                 7-19         19     13    2 18 39 45 47 49 50 53 57
Q       Rdg.    1-9                 10-21        21     12    16 22 26 29 30 35 47 51 56 59 63
R       Rdg.    1-4                 5-16         16     12    7 8 17 24 25 33 48 50 51 60
S       St.Sk.  1-18                19-33        33     15    4 12 19 24 28 35 39 46 52
T       St.Sk.  1-18                19-35        35     17    1 3 14 15 26 31 39 48 54
U       Comb.   1-17       18       19-27        27     10    58 59
V       Comb.   1-28       36       29-35        36     8     60 61
W       Comb.   1-36       39,43    37-38,40-42  43     7     62 63
X       Comb.   1-15       16       17-20        20     5     61 62

Total Cognitive            15       173          188


*Item 35 is a three-part reading item




Table 6(4)
Assessment Items for Grade 8/Age 13


                                                 No.    No.
                           Writing  Reading      Total  Cog.
Block   Type    Bg. Items  Items    Items        Items  Items Booklets Containing Block

Common  Bg.     1-37                             37           1-63
A       Wr.     1-12       13                    13     1     2 3 5 11 24 36 37 43 56
B       Wr.     1-15       16                    16     1     10 12 20 30 32 36 44 45 48
C       Wr.     1-22       23                    23     1     4 5 15 16 18 21 27 33 44
D       Wr.     1-24       25                    25     1     3 10 13 21 23 29 42 50 52
E       Wr.     1-9        10,11                 11     2     4 11 17 20 34 40 42 47 54
F       Wr.     1-5        6,7                   7      2     6 8 10 18 26 38 40 43 46
G       Wr.     1-6        7,8                   8      2     1 6 34 37 44 51 52 55 57
H       Rdg.    1-5                 6-18         18     13    5 6 12 17 22 23 31 41 53
J       Rdg.    1-10                11-24        24     14    14 25 27 28 30 41 42 43 57 58
K       Rdg.    1-8                 9-17         17     9     7 13 19 22 27 37 38 45 54
L       Rdg.    1-21                22-27        27     6     1 2 9 16 19 23 25 32 40
M       Rdg.    1-4                 5-16         16     12    8 11 13 15 32 35 41 49 55
N       Rdg.    1-11                12-23        23     12    7 9 14 20 21 46 53 55 56
O*      Rdg.    1-11                12-21        21     10    9 28 29 31 33 34 36 38 49
P       Rdg.    1-6                 7-15         15     9     2 18 39 45 47 49 50 53 57
Q       Rdg.    1-6                 7-23         23     17    16 22 26 29 30 35 47 51 56
R       Rdg.    1-4                 5-19         19     15    7 8 17 24 25 33 48 50 51 60
S       St.Sk.  1-18                19-37        37     19    4 12 19 24 28 35 39 46 52
T       St.Sk.  1-18                19-38        38     20    1 3 14 15 26 31 39 48 54
U       Comb.   1-17       18       19-31        31     14    58 59
V       Comb.   1-28       32       29-31        32     4     60 61
W       Comb.   1-36       37,42    38-41        42     6     62 63
X       Comb.   1-15       16       17-24        24     9     61 62
Y       Comb.   1-3                 4-10         10     7     59 63

Total Cognitive            15       191          206


*Item 15 is a two-part reading item




Table 6(5)
Assessment Items for Grade 11/Age 17


                                                 No.    No.
                           Writing  Reading      Total  Cog.
Block   Type    Bg. Items  Items    Items        Items  Items Booklets Containing Block

Common  Bg.     1-48                             48           1-63
A       Wr.     1-12       13                    13     1     2 3 5 11 24 36 37 43 56
B       Wr.     1-15       16                    16     1     10 12 20 30 32 36 44 45 48
C       Wr.     [?]        [?]                   [?]    1     4 5 15 16 18 21 27 33 44
D       Wr.     [?]        [?]                   [?]    1     3 10 13 21 23 29 42 50 52
E       Wr.     1-9        10,11                 11     2     4 11 17 20 34 40 42 47 54
F       Wr.     1-5        6,7                   7      2     6 8 10 18 26 38 40 43 46
G       Wr.     1-6        7,8                   8      2     1 6 34 37 44 51 52 55 57
H       Rdg.    1-6                 7-19         19     13    5 6 12 17 22 23 31 41 53
J       Rdg.    1-11                12-17        17     6     14 25 27 28 30 41 42 43 57 58
K       Rdg.    1-8                 9-17         17     9     7 13 19 22 27 37 38 45 54
L       Rdg.    1-26                27-32        32     6     1 2 9 16 19 23 25 32 40
M       Rdg.    1-4                 5-16         16     12    8 11 13 15 32 35 41 49 55
N       Rdg.    1-20                21-32        32     12    7 9 14 20 21 46 53 55 56
O       Rdg.    1-11                12-24        24     13    9 28 29 31 33 34 36 38 49
P       Rdg.    1-14                15-25        25     11    2 18 39 45 47 49 50 53 57
Q       Rdg.    1-6                 7-17         17     11    16 22 26 29 30 35 47 51 56
R       Rdg.    1-11                12-20        20     9     7 8 17 24 25 33 48 50 51 60
S       St.Sk.  1-18                19-37        37     19    4 12 19 24 28 35 39 46 52
T       St.Sk.  1-18                19-38        38     20    1 3 14 15 26 31 39 48 54
U       Comb.   1-17       18       19-31        31     14    58 59
V       Comb.   1-37       41       38-40        41     4     60 61
W       Comb.   1-38       39,44    40-43        44     6     62 63
X       Comb.   1-15       16       17-24        24     9     61 62
Y       Comb.   1-5                 6-12         12     7     59 63

Total Cognitive            15       176          191



6.1.1    Assembling  Reading  and  Writing  Items  into  Blocks 


The  following  considerations  were  taken  into  account  during  the  process 
of  assembling  the  blocks: 

(1)  Because  of  the  order  of  assessment  administration,  blocks  for 
Grade 8/Age 13 were developed first, then those for Grade
4/Age  9,  and  finally  those  for  Grade  11/Age  17.  Ideally, 
blocks  for  all  three  grade/ages  should  have  been  developed 
together. 

(2)  An  item  was  selected  to  be  placed  within  a  specific  block 
based  on  the  time  required  to  complete  the  item. 

(3)  For  Grade  11/Age  17,  some  blocks  were  repeated  intact  from 
the  blocks  assembled  for  Grade  8/Age  13. 

(4)  In  general,  an  attempt  was  made  to  start  blocks  with  easy 
items  and  progress  to  difficult  ones.    This  was  not  always 
possible. 

(5)  When a reading item required a lengthy written response, the
item  was  always  placed  at  the  end  of  a  block. 

(6)  Whenever  possible,  reading  items  were  physically  arranged  so 
that  the  reading  passage  and  the  items  appeared  on  the  same 
or  facing  pages.    This  was  not  possible  when  the  stimulus 
material  was  lengthy. 

(7)  Any  item  that  had  been  revised  and  was,  therefore,  different 
from its earlier form as used in previous assessments was
considered to be a new item.

(8)  The  tapes  contained  only  items  that  had  been  used  in  past 
assessments. Items were fit into the tape blocks based on
the timing of the items as taken from the tape scripts.


6.1.2    Reading  Items 

The reading items included short and long reading passages, graphically
presented materials, poems, and reference materials (e.g., tables of
contents). Some items required a multiple-choice response, some open-ended
items required a brief written response, and some required written essays.
These latter items, of which there was a total of twelve across the three
grade/ages, were professionally scored. (The professional scoring process
is discussed in Chapter 8.2.)

Some  of  these  items  had  been  developed  for  the  earliest  reading 
assessment  and  re-used  in  some  or  all  of  the  subsequent  assessments;  some 
items had been developed exclusively for the 1983-84 assessment; and
some items had been developed and used only once over the years.
(Development  of  the  reading  objectives  and  items  is  discussed  in  Chapter 
3.)    In  addition,  some  items  had  remained  unchanged  in  wording  and 
arrangement  while  others  had  undergone  a  variety  of  alterations.    Each  item 
was  carefully  researched  as  to  its  use  and  possible  alteration  over  time. 
This  process  became  important  specifically  for  those  items  included  in  the 
tape booklets for use in the Year 15 trend analysis. (See Table B(1) in
Appendix B for a list of the items initially considered for use in the
trend analysis.)

Tables  6(6)  through  6(8)  examine  the  tapes  for  each  age.    These  tables 
show  which  items  (by  item  location  number)  from  each  spiral  block  were  used 
in  the  assembly  of  the  tape  booklets.    For  Age  9,  Tape  2  contains  one 
reading item that does not appear within any of the spiral blocks. For Age
17,  there  are  20  such  reading  items  across  all  four  tapes. 

Table  A(l)  in  Appendix  A  provides  a  complete  descriptive  list  of  all 
Year  15  reading  items  with  their  corresponding  block  or  tape  numbers  and 
item  numbers.    As  can  be  seen  from  this  table,  the  number  of  items 
presented  to  each  age  varied.    Some  items  overlapped  all  three  grade/ages, 
some  overlapped  two  grade/ages,  and  some  were  particular  to  a  grade/age.  A 
total  of  176  reading  items  was  presented  to  Grade  4/Age  9;  a  total  of  192 
reading  items  was  presented  to  Grade  8/Age  13;  and  a  total  of  196  reading 
items  was  presented  to  Grade  11/Age  17.    Complete  item  text  is  available  on 
the  microfiche  that  accompanies  the  public-use  data  tapes. 


6.1.3    Writing  Items 

Writing  items  appeared  in  spiral  blocks  A  through  G  and  U  through  X  and 
one  or  two  of  the  tapes  depending  upon  the  age  group.    Table  6(9)  presents 
the  pool  of  writing  items  and  their  block  or  tape  locations  by  age. 

From  a  total  pool  of  22  writing  items,  15  were  used  for  each  grade/age. 
Some  of  these  items  had  been  used  in  one  or  both  of  the  previous  writing 
assessments.    By  design,  students  who  received  one  or  more  writing  blocks 
could  be  asked  to  respond  to  as  few  as  one  writing  item  or  as  many  as  four. 

The  writing  items  were  developed  to  assess  performance  in  three  writing 
areas: informative, persuasive and imaginative. Students were asked to
write, for example, letters, descriptive essays, or narrative pieces. (The
development  of  writing  objectives  and  items  is  discussed  in  Chapter  3; 
professional  scoring  of  the  writing  responses  is  discussed  in  Chapter  8.2.) 

Four  of  the  writing  items,  Dali,  Aunt  May,  Split  Session  and  Hole  in 
the  Box,  appeared  in  the  tape  booklets  to  be  used  in  determining  writing 
trends. For more information concerning writing trends see Writing: Trends
Across  the  Decade,  1974-84  (Applebee,  Langer,  &  Mullis,  1986a). 




Table  6(6) 

Cognitive Items from Spiral Blocks on Age 9 Tapes


ABCDEFGHJK 


M  N 


Q      K    S4  T-l- 


Tape 

1 


Tape 

2 


Tape 
3 


Tape 
4 




13* 


21 


1  7 

10 

9  19 

19 

24  30 

1ft 

11 

20 

20 

25 

1  Q 

?6 

20 

27 

22 

27 

21 

28 

23 

22 

29 

26 

28 

29 

30 

31 

32 

33 

34 

35 

6  9    22  13 

7  10 

8  11 


15  10 
11 
12 
13 


15  10 


16>- 


5  12  16 

10  13  17 

11  14 

12  15 

13  16 

14  17 
15 


10 
11 
12 


16 


13  11 

14  1? 
16 


21 

22 
23 
24 
25 


19  18 

20  19 


12 
i3 
14 
17 
18 
19 


30 


18*    29  37 
38 


Total  Items 
on  Tape 

36 


18 

One item on
tape  does  not 
appear  within 
the  blocks 


29 


18 


+ Study Skills Items
*  Writing  Items 




Table  6(7) 

Cognitive  Items  from  Spiral  Blocks  on  Age  13  Tapes 


Tape 
1 


Tape 

2 


Tape 

3 


Tape 
4 


A    B    C    D    E    F    G    H    J  V 


17  9 

18  10 

19  11 
14 
15 

 16 

7  11 

8  12 
12  13 

14 
15 
16 


23 


P  Q 

R  S+ 

T| 

1  0 

7 

22 

19 

19 

13 

8 

30 

20 

20 

14 

9 

31 

21 

21 

15 

32 

22 

22 

19 

33 

23 

23 

20 

34 

26 

24 

35 

28 

36 

29 

37 

30 

31 

U     V  w 


18 


12 
13 
14 
15 


11 
12 
13 


16  16 


10 
11 


19  32 

20  33 

21  34 
35 


20  12 

21  13 


12 
13 
14 
17 
18 
19 


7  16 

8  17 

9  18 


27  18* 


+ Study Skills Items
*  Writing  Items 


16* 


4 
5 
6 
7 
8 
9 
10 


17 
18 
19 
20 
21 
22 
23 
24 


Total  Items 
on  Tape 


34 


17 


30 


26 




Table  6(8) 

Cognitive  Items  from  Spiral  Blocks  on  Age  17  Tapes 


ABCDEFGHJK 


M  N 


R    SI  T-t- 


Tape 
1 


Tape 

2 


21 

22 

19 

19 

22 

30 

20 

20 

23 

21 

21 

21 

24 

32 

22 

22 

28 

33 

23 

23 

29 

34 

26 

24 

35 

28 

36 

29 

37 

30 

31 

13* 


Tape 
3 


Tape 
4 


12  9 

13  10 

14  11 
14 
15 
16 


27 


27 


7 
8 
9 


13 


11  25 

12 

13 


10 


19  32 

20  33 

21  34 
35 


12 
13 


12 
13 
14 
21 
22 


11 
12 
13 


27  18* 




+ Study Skills Items
*  Writing  Items 


16* 


6 
7 
8 
9 

10 
11 
12 


17 
18 
19 
20 
21 
22 
23 
24 


Total  Items 

 on  Tape 

34 

Three  items  on 
tape  do  not 
appear  within 
the  blocks 


17 

One  item  on 
tape  does  not 
appear  within 
the  blocks 


30

Ten  items  on 
tape  do  not 
appear  within 
the  blocks 


26 

Six  items  on 
tape  do  not 
appear  within 
the  blocks 




Table  6(9) 
Year  15  Writing  Items 


Item 


Block  and  Tape  Locations 
Age  9  Age  13         Age  17 


N000102 


N000202 

SCHOOL  RULE 

B 

B 

B 

N000302 

RECREATION  OFf. 

C 

C 

N000402

FOOD  ON  FRONTIER 

D 

D 

D 

N000502 

DISSECTING  FROGS 

E 

N000602 

XYZ  COMPANY 

E 

E 

N000702 

SWIMMING  POOL 

F 

F 

F 

N000802 

PETS 

F 

F 

N000902 

RADIO  STATION 

G 

G 

N001002 

APPLEBY  HOUSE 

G 

G 

G 

N007202 

HOLE  IN  THE  BOX 

U.Tape  4 

U,Tape  4 

U.Tape  4 

N007602 

FLASHLIGHT 

V 

V 

V 

N007702 

GHOST  STORY 

w 

W 

W 

N007902 

FAVORITE  MUSIC 

w 

W 

w 

N008002 

SPLIT  SESSION 

X.Tape  2 

X.Tape  2 

N014702 

PLANTS 

c 

N014802 

SPACESHIP 

E 

N014902 

AUNT  MAY 

X.Tape  2 

N018002

SPACE  PROGRM 

E 

N019002 

JOB  APPLICATION 

E 

N020002 

UNCLE 

F 

N021002 

BIKE  LANE 

G 



Tables  6(6)  through  6(8)  examine  the  tapes  for  each  age.    These  tables 
show  which  items  (by  item  location  number)  from  each  spiral  block  were  used 
in  the  assembly  of  the  tape  booklets.    Writing  items  are  indicated  by 
asterisks. 

Complete  writing  item  text  is  available  on  the  microfiche  that 
accompanies  the  public-use  data  tapes. 


6.1.4    Non-Cognitive Items

For  the  Year  15  assessment,  each  spiral  and  tape  booklet  included  six 
minutes  of  background  and  attitude  items  common  to  all  students.  These 
items  are  general  questions  concerning  materials  in  the  home,  parental 
education,  etc.    Additional  background  and  attitude  items  were  spiralled 
throughout  BIB  and  UBIB  booklets  (see  Chapter  5).    These  attitude  items 
related  to  objectives  formulated  for  reading  and  writing.    The  items 
measured  students'  perceptions  of  their  teachers'  instructional  practices 
in  reading  and  writing;  their  own  study  habits  and  reading  activities; 
their  perceptions  of  the  value  of  reading  and  writing;  and  their  assessment 
of  themselves  as  readers  and  writers. 

Table  A(2)  in  Appendix  A  lists  descriptors  of  all  of  the  background  and 
attitude  items  grouped  by  the  topics  they  were  designed  to  address.  The 
series  of  a  letter  and  numbers  preceding  each  descriptor  is  the  unique  NAEP 
ID  assigned  to  that  particular  item.    If  the  ID  begins  with  "B",  the  item 
was  included  in  the  common  block  of  items  administered  to  all  students.  If 
the  ID  begins  with  "S",  the  item  appeared  at  the  beginning  of  a  single-  or 
double-length  block. 

Table  A(3)  in  Appendix  A  lists  descriptors  of  all  background  and 
attitude  items  in  NAEP  ID  order  with  block  location  and  item  number  within 
block  for  each  grade/age.    The  common  block  (CB)  items  are  listed  first, 
followed  by  items  which  appear  in  the  single-  and  double-length  blocks  (A 
through  X).    Grade  11/Age  17  students  were  presented  an  additional  number 
of  items,  many  of  which  were  curricula-specific.    Complete  item  text  is 
available  on  the  microfiche  that  accompanies  the  public-use  data  tapes. 

In addition to the common core items, the Year 15 tape booklets
contained  additional  non-cognitive  items,  which  appeared  as  a  separate 
section  at  the  end  of  the  booklets.    These  items  were  drawn  from  the  pool 
of  items  appearing  in  the  spiral  booklets. 


6.2    The  Excluded  Student  Questionnaire 

The  Excluded  Student  Questionnaire  was  developed  and  used  for  the  first 
time  in  the  Year  15  assessment.  It  was  designed  to  gather  more  information 
about  particular  conditions  for  exclusion  and  characteristics  of  the 
learning experience of excluded students.




The questionnaire was completed by school personnel for every student
who was selected for inclusion in the NAEP sample but was unable to respond
to items because he or she was judged by school personnel to be non-English
speaking, educable mentally retarded or functionally disabled. The
four-page questionnaire was used to gather information concerning special
education,  language,  and  other  student  programs.    A  copy  of  the  Excluded 
Student  Questionnaire  is  available  on  the  microfiche  that  accompanies  the 
public-use  data  tapes. 

Of  the  104,437  students  sampled  for  the  Year  15  assessment,  4,225  were 
ineligible  or  excluded  by  the  school  due  to  classification  as  educable 
mentally  retarded,  non-English  speaking,  or  functionally  disabled.  There 
were  1,416  (4.3  percent)  excluded  students  in  Grade  4/Age  9,  1,448  (4.1 
percent)  in  Grade  8/Age  13  and  1,361  (3.7  percent)  in  Grade  11/Age  17. 
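
The three grade/age counts above can be checked against the stated total with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the overall rate it computes (excluded students as a share of all 104,437 sampled students) is not reported in the source and is shown only as an illustration.

```python
# Counts reported in the text for the Year 15 assessment.
sampled_total = 104_437
excluded_by_grade_age = {
    "Grade 4/Age 9": 1_416,
    "Grade 8/Age 13": 1_448,
    "Grade 11/Age 17": 1_361,
}

# The per-grade/age counts should add up to the stated total of 4,225.
excluded_total = sum(excluded_by_grade_age.values())

# Illustrative only: the source reports per-grade/age percentages,
# not this overall figure.
overall_rate = 100 * excluded_total / sampled_total

print(f"Excluded students, all grade/ages: {excluded_total:,}")
print(f"Overall exclusion rate: {overall_rate:.1f} percent")
```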


6.3    The  Teacher  Questionnaire 

The  Teacher  Questionnaire  was  developed  and  used  for  the  first  time  in 
Year  15.    It  was  designed  to  gather  information  on  the  curricula  and 
teaching  methods  used  by  selected  English  and  Language  Arts  teachers.  The 
data  were  provided  by  teachers  who  completed  a  nine-page  questionnaire 
which  included  questions  concerning  years  of  teaching  experience,  frequency 
of  writing  assignments,  teaching  materials  used,  the  availability  and  use 
of  computers,  and  perceptions  of  the  school  and  its  curricula. 

To  sample  teachers  for  the  teacher  questionnaire  we  associated  the 
student's  main  language  arts  or  English  teacher  with  each  student 
participating in the spiral assessment. We requested that the student's
main  English  teacher  be  identified  in  the  background  information  sheet. 
The  sample  of  teachers  was  then  drawn  by  selecting  one  student  from  each  of 
the  sessions  being  conducted  at  the  school.    Each  teacher  sampled  received 
only  one  questionnaire  even  though  he  or  she  may  have  been  associated  with 
more  than  one  of  the  students  subsampled  for  this  purpose.    Further  detail 
on  the  sampling  of  teachers  is  provided  in  Section  2.4  of  Chapter  2. 
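
The selection rule described above (one student per session, one questionnaire per teacher) can be sketched as follows. The data layout, the function name `sample_teachers`, and the use of simple random selection are assumptions made for illustration; the actual procedure is the one documented in Section 2.4 of Chapter 2.

```python
import random

def sample_teachers(sessions, seed=0):
    """Pick one student per session and return the distinct teachers named.

    `sessions` maps a session ID to a list of (student_id, teacher_id)
    pairs. The layout and the simple random draw are assumptions made
    for this sketch.
    """
    rng = random.Random(seed)
    teachers = []
    for session_id in sorted(sessions):
        student_id, teacher_id = rng.choice(sessions[session_id])
        # A teacher linked to several subsampled students still receives
        # only one questionnaire.
        if teacher_id not in teachers:
            teachers.append(teacher_id)
    return teachers

# Hypothetical school with two sessions.
example_sessions = {
    "session-1": [("s01", "t-A"), ("s02", "t-B")],
    "session-2": [("s03", "t-A"), ("s04", "t-A")],
}
selected = sample_teachers(example_sessions)
print(selected)
```

In this example teacher `t-A` is always selected through the second session, and appears only once even if the first session's draw also points to `t-A`.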

Responses  were  received  from  a  total  of  1,027  fourth  grade  teachers, 
790  eighth  grade  teachers  and  915  eleventh  grade  teachers. 

Three versions of the Teacher Questionnaire were developed, one for
each  grade/age.    A  copy  of  each  Teacher  Questionnaire  is  available  on  the 
microfiche  that  accompanies  the  public-use  data  tapes. 


6.4    The  School  Characteristics  and  Policy  Questionnaire 

A  School  Characteristics  and  Policy  Questionnaire  was  distributed  to 
each  participating  school  to  be  completed  by  either  the  school's  principal 
or  another  person  familiar  with  data  concerning  enrollment,  facilities, 
curricula  and  staff  development. 




The  five-page  questionnaire  was  developed  for  two  purposes:     to  collect 
school  data  proven  by  research  studies  to  be  related  to  student 
performance;  and  to  collect  school  data  for  use  by  educational  policymakers 
both  to  monitor  implementation  of  existing  policies  and  to  identify  new 
policy  issues. 

The questionnaire items were grouped according to eight categories:
principal,  students,  staff,  standards,  program,  computers,  school  climate, 
and  school  finance. 

Responses  were  received  from  663  fourth-grade  schools,   ,86  eighth-grade 
schools,  and  331  eleventh-grade  schools.    Cooperation  rates  were  88.6 
percent, 90.3 percent, and 83.9 percent for fourth-grade, eighth-grade, and
eleventh-grade schools, respectively; the overall cooperation rate was
88.1  percent. 

Because  no  eligible  students  were  selected  in  several  schools  that 
submitted  responses,  the  number  of  schools  for  which  data  are  retained  in 
the  NAEP  database  is  less  than  the  number  of  schools  from  which  responses 
were  received.    The  NAEP  database  contains  data  for  661  fourth-grade 
schools,  478  eighth-grade  schools,  and  326  eleventh-grade  schools. 

A  copy  of  the  School  Characteristics  and  Policy  Questionnaire  is 
available  on  the  microfiche  that  accompanies  the  public-use  data  tapes. 




Chapter  7 
FIELD  ADMINISTRATION 


Renee  Slobasky 
Nancy  Caldwell 

Westat, Inc.


As  a  subcontractor  to  ETS,  Westat,  Inc.  was  responsible  for  field 
activities  leading  to  and  including  administration  of  the  assessment 
sessions  and  delivery  of  completed  assessment  booklets  to  ETS.  (Westat, 
Inc.  was  also  responsible  for  sample  design  and  implementation,  discussed 
in  Chapter  4.)    This  chapter  describes  the  Westat  field  organization  and 
operations  for  the  Year  15  assessment.    Details  of  field  administration 
activities  are  available  in  the  Westat  Report  on  Field  Operations  and  Data 
Collection Activities - NAEP Year 15 (1984).

The  Year  15  assessment  focused  on  the  learning  areas  of  reading  and 
writing. For this assessment, over 1,600 schools were sampled and invited
to cooperate. Of this number, 1,465 schools actually participated. Within
these  schools,  a  sample  of  114,075  students  was  selected  to  be  assessed. 


7.1    Schedule of Year 15 Field Activities

The  Year  15  pre-assessment  and  assessment  field  activities  were 
conducted  from  May  1983  to  May  1984.    The  period  from  May  to  September  1983 
was  devoted  to  the  pre-assessment  activities  of  establishing  the  field 
force  and  developing  all  materials  and  procedures  to  be  used  during  the 
assessments. Pre-assessment activities are described in Section 7.2.

In  early  October  1983,  the  assessment  began.    Thirteen-year-olds  and 
eighth  graders  were  assessed  during  the  period  from  October  10  to  December 
16,  1983.    Nine-year-olds  and  fourth  graders  were  assessed  from  January  2 
to  March  9,  1984.    The  last  group,  the  seventeen-year-olds  and  eleventh 
graders,  were  the  focus  of  assessment  activities  from    March  12  to  May  11, 
1984.    (In  four  schools,  makeup  sessions  were  scheduled  after  May  11  due  to 
poor  attendance  at  the  initial  sessions.)    Conduct  of  the  assessments  is 
described  in  Section  7.3. 

Quality  control  was  an  important  part  of  the  entire  field  effort.  In 
addition  to  the  field  monitoring  activities  described  in  Section  7.4, 
in-person  site  visits  were  made  to  a  sample  of  schools  and  an  additional 
sample  of  schools  was  interviewed  by  telephone.    These  quality  control 
activities  are  described  in  Section  7.3.5. 




7.2    Pre-Assessment  Activities 


7.2.1    Establish  Field  Organization 

The  home  office  staff  involved  in  supervising  the  field  operations 
included  the  field  director  and  assistant  field  director.    The  field 
director  coordinated  all  field  operation  activities  in  the  home  office  and 
had  the  primary  reporting  relationship  with  half  of  the  district 
supervisors.    The  assistant  field  director  had  the  primary  reporting 
relationship  with  the  other  half  of  the  district  supervisors,  and  was  also 
responsible  for  materials  distribution  and  directing  the  receipt  of 
reporting  forms  (as  described  in  Section  7.4)  in  the  home  office. 

As  described  in  Chapter  4,  Sample  Selection  and  Instrument  Collection, 
64  counties  or  groups  of  counties  (primary  sampling  units,  or  PSUs)  were 
selected  for  the  Year  15  sample.    The  64  PSUs  were  then  grouped  into 
sixteen  major  regions  based  on  a  fairly  equal  geographic  spread  of  schools. 
In  June  1983,  Westat  recruited  sixteen  district  supervisors  and  five 
alternate  supervisors  to  assist  district  supervisors  when  there  were 
scheduling  conflicts.    Each  district  supervisor  was  assigned  one  of  the 
sixteen  major  regions. 

The  district  supervisor  was  responsible  for  a  variety  of  tasks.  During 
the  pre-assessment  phase,  the  district  supervisor  contacted  school 
districts (after the initial contact was made by ETS) and arranged for an
introductory  meeting  with  school  personnel;  conducted  the  introductory 
meeting  and  scheduled  each  school's  assessment;  and  recruited  exercise 
administrators  to  assist  in  the  conduct  of  the  assessments. 

During  the  assessment  phase,  the  district  supervisor  sampled  the 
students  to  be  assessed  in  each  school;  trained  and  provided  support  to  the 
exercise  administrator  who  conducted  the  assessment;  distributed  and 
collected  the  Excluded  Student  Questionnaires,  Teacher  Questionnaires,  and 
School  Characteristics  and  Policy  Questionnaire;  and  completed  all 
administrative  reporting  forms.    After  an  assessment  at  a  school  was 
complete,  the  district  supervisor  packed  and  shipped  all  school  materials 
to  ETS. 

Each  district  supervisor  hired  between  one  and  three  exercise 
administrators  per  PSU.    A  few  exceptions  were  made  in  regions  where  the 
schools  in  several  PSUs  were  clustered  in  large  metropolitan  areas.  In 
these  regions,  the  supervisors  hired  three  to  four  exercise  administrators 
who  worked  the  entire  metropolitan  area.    The  exercise  administrators 
assisted  the  supervisor  in  selecting  the  sample  of  students  to  be  assessed, 
conducted  the  assessment  sessions,  and  prepared  completed  exercise  booklets 
for  shipping. 

For  the  most  part,  staffing  of  the  field  organization  remained  fairly 
constant  throughout  the  field  period.    One  district  supervisor  was  replaced 
prior to the assessment phase of the project. There was approximately
15 percent attrition among the exercise administrators; however, this
turnover had little, if any, impact on the conduct of the work.

The  background  and  experience  of  the  district  supervisors  are 
summarized in Table 7(1). As can be seen from the table, fourteen
supervisors lived in one of the PSUs of their region, four had worked on
NAEP before (two as supervisors, two as exercise administrators), all had
had some supervisory experience, ten had worked for Westat before the NAEP
project, and eight had an educational background (teaching or educational
research).


7.2.2    District  Supervisor  Training 

District  supervisors'  training  was  divided  into  two  parts.    Part  I, 
which  lasted  two  days,  introduced  the  study  and  explained  pre-assessment 
activities.  Part  II,  which  lasted  three  days,  was  devoted  to  actual 
assessment  activities.    Conducting  the  training  in  two  short  sessions 
rather  than  one  long  one  was  a  departure  from  past  practice.    With  two 
sessions,  each  session  could  focus  on  the  particular  tasks  at  hand  and  not 
present  too  much  detailed  information  at  once.    This  arrangement  also  gave 
Westat home office staff more time to assess the strengths and weaknesses
of  the  supervisors  and  to  take  any  necessary  corrective  action. 

The  first  supervisors'  training  session  was  held  August  1-2,  1983. 
Training  was  conducted  by  the  Westat  project  director  and  field  director, 
with introductory remarks and explanatory notes made by ETS. In attendance
were the supervisors, alternate supervisors, and representatives from ETS'
regional  offices  who  were  to  make  the  initial  contacts  with  school 
districts  to  solicit  participation.    The  topics  included  an  overview  of 
NAEP  and  the  supervisors'  responsibilities;  procedures  for  contacting 
schools  and  conducting  introductory  meetings;  planning  the  schedule  of 
assessments  within  PSUs;  and  recruiting  and  training  exercise 
administrators. 

Immediately  prior  to  the  assessment  phase  of  the  field  effort,  Westat 
and  ETS  staffs  reassembled  for  Part  II  of  supervisor  training,  held  October 
3-5,  1983.    Topics  included  training  and  supervising  exercise 
administrators;   the  student  sample  selection  process;  administrative 
procedures  for  conducting  the  assessments;  supervisory  responsibility  for 
quality  control  of  assessment  sessions  and  all  NAEP  materials;  and 
procedures  for  shipping  materials  and  reports  to  Westat  and  ETS.  This 
session  was  also  attended  by  Westat  staff  and  the  ETS  regional  staff  who 
would  be  conducting  in-person  quality  control  visits  to  sampled  schools  to 
verify  the  sampling  and  observe  the  supervisors  and  exercise  administrators 
at  work. 




Table  7(1) 


Criteria  Met  by  NAEP  Supervisors  by  Supervisory  Region 


Region; Lived Within Selected PSU; Prior NAEP Experience; Prior Westat Experience; Prior Supervisory Experience with Westat or Other Research Organization; Prior Employment in Education


I 

X 

X 

X 

II 

X 

X 

X 

X 

III 

X 

X 

X 

IV 

X 

X 

X 

V 

X 

X 

X 

X 

VI 

X 

X 

X 

VII 

X 

X 

X 

VIII 

X 

X 

X 

IX 

X 

X 

X 

X 

X 

X 

X 

X 

XI 

X 

X 

X 

X 

XII 

X 

X 

XIII 

X 

X 

X 

XIV 

X 

X 

X 

XV 

X 

X 

X 

X 

XVI 

X 

X 

X 



7.2.3    Solicit Cooperation of School Districts and Sample Schools


7.2.3.1    Preliminary  Contacts 

During June, July, and August 1983, ETS and Westat notified the
appropriate  state  and  local  school  officials  about  NAEP  and  requested  the 
cooperation  of  the  sample  schools.    The  activities  during  these  three 
months  are  discussed  in  detail  below. 

Recruiting  of  schools  for  NAEP  actually  began  in  June,  once  the  sample 
of  schools  had  been  selected  and  their  corresponding  school  districts 
identified.    ETS  contacted  the  chief  state  school  officers  in  each  state 
and  requested  them  to  notify  the  school  district  superintendents.    In  July, 
ETS  sent  a  letter  to  the  superintendents  and  heads  of  private  schools 
inviting their participation. Under separate cover, informational material
on  NAEP  and,  if  applicable,  a  list  of  the  original  sample  schools  in  the 
district,  were  also  sent.    These  initial  contacts,  which  were  completed 
prior to supervisor training, paved the way for the telephone contacts to
follow. 

Immediately  after  training,  ETS  regional  staff  contacted  the 
superintendents  to  discuss  NAEP  further  and  to  obtain  their  cooperation. 
The  results  of  these  contacts  were  documented  on  the  Results  of  Contact 
Form.    Once  cooperation  had  been  determined,  ETS  staff  mailed  two  copies  of 
this form to the district supervisor and one copy to the Westat home
office. 

Upon  receipt  of  the  Results  of  Contact  Form,  the  district  supervisor 
called  the  contact  person  listed  on  the  form  to  arrange  for  an  introductory 
meeting  with  representatives  of  the  sample  schools  and  to  obtain  updated 
information  on  schools  in  the  district.    Any  new  school  openings,  school 
closings  or  changes  in  grade  or  enrollment  were  recorded  on  the  School 
Update Form and sent to Westat. Changes in address, principal or school
name were recorded on the copy of the PSU List of Schools and sent to
Westat.

When the supervisor and school district or private school official had
scheduled the introductory meeting, the supervisor completed the Schedule
of Introductory Meetings and submitted it to Westat so that Westat could,
in  turn,  send  out  informational  packages  and  confirmation  letters  to  the 
appropriate  school  officials. 


7.2.3.2    Introductory  Meetings 

From  August  29  to  September  30,  1983,  the  district  supervisors  spent 
about  one  week  in  each  of  their  PSUs  conducting  introductory  meetings  with 
school  officials.    Although  the  primary  purpose  of  these  meetings  was  to 
explain NAEP in more detail to the school officials, several other purposes
were served as well. During the introductory meeting, the supervisor was
responsible for:

• answering questions about NAEP;

• explaining the schools' role in NAEP and distributing the
appropriate Summary of School Tasks;

• distributing Student Listing Forms and explaining their use and
procedures for filling them out;

• setting up a preliminary schedule for assessments;

• identifying the person within each school who would coordinate all
assessment activities;

• collecting and reviewing completed Principal Questionnaires;

• verifying and completing the School Control Form with each
principal; and

• obtaining recommendations for exercise administrators, if necessary.

The  introductory  meetings  were  the  first  opportunity  for  principals  and 
other school officials to discuss the assessment with NAEP staff. Thus,
the meetings were particularly important for establishing rapport with the
schools,  assuring  school  cooperation,  and  explaining  the  details  of  the 
schools'  tasks  to  the  individuals  responsible  for  them. 


7.2.3.3    Schools  Added  to  the  Original  Sample 

Due  to  a  variety  of  sampling  reasons  (described  in  Chapter  4),  it  was 
sometimes  necessary  to  add  schools  to  the  original  sample.    Because  the 
process  of  adding  schools  to  the  sample  did  not  begin  until  late  September, 
when  introductory  meetings  were  already  taking  place,  the  procedures  for 
contacting and gaining cooperation from these schools necessarily differed
from those described for the original sample. For the added schools, Westat
first  mailed  a  letter  to  the  district  superintendents  and  heads  of  private 
schools.    Then,  the  district  supervisor  telephoned  the  contact  person  in 
the  superintendent's  office  and  asked  him  or  her  to  notify  the  sample 
schools. Westat then mailed a principal's package with Student Listing
Forms  and  the  Summary  of  School  Tasks  to  each  school.    After  three  to  four 
days  the  supervisor  called  the  school  and  conducted  the  introductory 
meeting  by  telephone.     ETS  regional  staff  provided  assistance  as  needed  in 
contacting districts and individual schools. Whenever an in-person
introductory meeting was considered essential to ensure cooperation, the
district  supervisor  scheduled  the  meeting  during  the  time  that  he  or  she 
would  be  in  the  PSU  for  the  first  round  of  assessments. 




7.2.4    Recruit and Train Exercise Administrators


An  important  part  of  the  supervisors'  pre-assessment  responsibilities 
was  to  hire  and  train  exercise  administrators,  the  persons  whose  primary 
function  it  was  to  administer  the  assessment  booklets  to  the  sample 
students.    District  supervisors  were  encouraged  to  use  their  own  discretion 
in  planning  for  and  hiring  exercise  administrators.    Westat  provided 
guidelines  for  the  number  of  exercise  administrators  to  be  hired  and  also 
the  names  of  possible  exercise  administrator  candidates  located  in  the 
supervisor's  PSUs. 

Supervisors  were  told  that,  in  general,  two  exercise  administrators 
should  be  hired  for  each  PSU,  although  a  variety  of  factors  might  influence 
the  actual  number.    The  number  of  schools  in  a  PSU,  the  size  of  the  student 
sample  in  each  school,  distances  to  be  traveled,  the  geography  of  the  area, 
and  weather  conditions  during  particular  times  of  the  year  were  all  factors 
considered  by  supervisors  in  developing  plans  for  exercise  administrators. 
A  few  supervisors  had  contiguous  PSUs;  they  hired  the  same  exercise 
administrators  to  work  in  all  of  their  PSUs.    Other  supervisors  had  PSUs 
where  schools  were  small  and  widely  scattered;  they  tended  to  hire  exercise 
administrators  to  work  only  a  portion  of  the  PSU. 

Candidates  for  the  exercise  administrator  positions  came  from  several 
sources.    Exercise  administrators  from  previous  assessments  applied  for  the 
jobs. Westat consulted its file of field personnel who had worked on
previous Westat studies. Supervisors also used recommendations from school
officials  for  uncovering  good  candidates.    If  necessary,  advertisements 
were  placed  in  local  newspapers. 

Supervisors were encouraged to hire locally, and to hire individuals
with  teaching  experience  or  the  ability  to  handle  classroom  situations. 
Many  exercise  administrators  were  retired  or  substitute  teachers. 

Training  the  exercise  administrators  was  one  of  the  supervisor's  first 
tasks  upon  arriving  in  the  PSU  before  beginning  the  assessments.    Prior  to 
the  supervisor's  arrival,  Westat  sent  trainees  the  Exercise  Administrator's 
Manual  which  described,  in  detail,  the  role  of  the  exercise  administrator 
and  procedures  to  be  followed.    Exercise  administrators  were  required  to 
study  the  manual  before  being  trained,  then  attend  a  half-day  training 
session  conducted  by  the  supervisor.    During  the  training,  the  supervisor 
reviewed,  in  detail,  all  aspects  of  the  exercise  administrator's  job, 
including  preparing  materials,  booklets  and  administration  schedules  for 
assessments;  the  actual  conduct  of  the  session;  post-assessment  collection 
of  booklets  and  pencils;  coding  of  booklet  covers;  recordkeeping;  and 
administrative  matters. 


7.3    Year  15  Assessments 

From October 10, 1983 to May 11, 1984, the assessments were conducted
one grade/age at a time. Each supervisor cycled through the four PSUs in
his or her region, completing all assessment activities for a grade/age in
a PSU before moving on to the next PSU. Ten weeks each were available for
supervisors to complete Grade 4/Age 9 and Grade 8/Age 13 assessments; nine
weeks were available to complete Grade 11/Age 17. In general, supervisors
spent from two to two and one-half weeks for each grade/age class
assessment in each PSU.
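
The week counts quoted above follow from the assessment dates given in Section 7.1, and can be compared with the time a supervisor needed to cover four PSUs. The sketch below is a worked check under the assumption that a "week" means a seven-day period counted from the first day of each window; the names `windows`, `calendar_weeks`, and `weeks_needed` are illustrative only.

```python
import math
from datetime import date

# Assessment windows as stated in Section 7.1.
windows = {
    "Grade 8/Age 13": (date(1983, 10, 10), date(1983, 12, 16)),
    "Grade 4/Age 9": (date(1984, 1, 2), date(1984, 3, 9)),
    "Grade 11/Age 17": (date(1984, 3, 12), date(1984, 5, 11)),
}

def calendar_weeks(start, end):
    """Seven-day weeks, counted from `start`, needed to cover start..end."""
    inclusive_days = (end - start).days + 1
    return math.ceil(inclusive_days / 7)

weeks_available = {name: calendar_weeks(s, e) for name, (s, e) in windows.items()}

# Four PSUs per region at two to two and one-half weeks each.
psus_per_region = 4
weeks_needed = (psus_per_region * 2, psus_per_region * 2.5)

for name, weeks in weeks_available.items():
    print(f"{name}: {weeks} weeks available")
print(f"Weeks needed for four PSUs: {weeks_needed[0]} to {weeks_needed[1]}")
```

Under these assumptions the four-PSU cycle needs eight to ten weeks, which fits the ten-week windows and leaves the least slack in the nine-week Grade 11/Age 17 window.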

Supervisors  developed  their  own  schedules  for  each  PSU  depending  upon 
the  size  and  location  of  schools,  the  number  of  students  to  be  assessed, 
and  any  special  situations  or  requests  by  the  schools  regarding  the  timing 
of  sessions.    School  holidays  and  requests  such  as  "not  on  Mondays  or 
Fridays,"  "only  in  the  mornings,"  "all  students  assessed  at  the  same  time," 
etc.  were  honored  by  supervisors  in  arranging  the  assessment  schedule. 
Such  requests  affected  not  only  the  assessment  schedule  but  also  the  number 
of  exercise  administrators  needed  at  each  school. 

Although flexibility had to be the hallmark of assessment scheduling,
supervisors generally followed the work plan detailed in their field
procedures. In essence, this plan involved the following order of
supervisory activities upon arriving in a PSU: meet exercise
administrators and, as part of their training, take them to a school to
observe sampling; complete exercise administrator training; draw samples in
one or two other schools with exercise administrators; begin assessments in
the first school and observe exercise administrators; sample other schools
while exercise administrators continue assessments. Where feasible, the
supervisors  went  to  each  school  on  assessment  day  to  confirm  all 
arrangements  and  initiate  activity.    Depending  upon  the  number  of  schools 
in  the  assessment,  the  supervisor  would  schedule  sampling  and  assessments 
on  different  days  so  that  he  or  she  could  be  present  at  all  assessments. 
Similarly,  most  supervisors  found  it  very  useful  to  have  at  least  one  of 
the exercise administrators assist with sampling. The supervisor would do
the actual sampling while the exercise administrator would double-check the
forms,  fill  out  administration  schedules,  and  check  the  school  files  for 
other  data,  if  necessary. 

In addition to the activities listed above, the supervisors contacted
schools in the next PSU to establish the assessment schedule; called
schools in the current PSU to confirm actual assessment dates; made return
trips to schools where assessments had been completed to pick up survey
forms  that  had  not  been  finished  at  the  time  of  assessment;  and  edited, 
boxed  and  shipped  completed  assessment  materials. 


7.3.1    Drawing  the  Sample  of  Students 

Supervisors  called  each  school  seven  to  ten  days  prior  to  the 
assessment to confirm all arrangements for sampling and assessment. The
time  between  sampling  and  assessment  was,  on  the  average,  about  four  days, 
depending  upon  the  school's  time  constraints  for  notifying  parents, 
teachers,  and  students. 

For those Grade 11/Age 17 schools with at least three sessions,
supervisors were encouraged to draw the sample during the Grade 4/Age 9
assessment because there is less time available in the spring for Grade
11/Age 17 assessments and, because these schools tend to be large, sampling
in these schools is more time-consuming than in smaller schools. All
supervisors tried early sampling; some abandoned it for various reasons.
In some schools, because of either geographic location or high rate of
turnover in the student body, it did not make sense to attempt to sample
early.

Sample selection for Year 15 was more complicated than for previous
assessments. First, Year 15 included as eligibles all students in the
modal grade as well as those who were age-eligible (the previous
NAEP eligibility criterion). Grade eligibles were included for the first
time so that the data could be analyzed by grade as well as by age.
Supervisors had to check the Student Listing Forms carefully to make sure
that only eligibles were included and that all eligibles were included.
This proved to be important because, in several instances, supervisors
discovered that schools had listed only students who were both age- and
grade-eligible. Similarly, supervisors frequently found one or two
students who were erroneously listed on the Student Listing Form. This was
particularly true for Grade 11/Age 17. The age definition for
Grade 11/Age 17 spans two calendar years (October 1966 to September 1967);
checking birthdates was more time-consuming because both month and
year had to be checked. Also, some of the students sampled as 11th graders
during the early sampling in winter had been promoted to the 12th grade at
mid-year (before the assessment in spring), modifying their eligibility
status. Students newly promoted to the 11th grade had to be added
to the Student Listing Form and given a chance of selection.
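
The eligibility check that supervisors applied to each listed student can be sketched as a small function. Only the Grade 11/Age 17 birth-date window (October 1966 to September 1967) and the 11th grade as the corresponding grade come from the text; the function names and the representation of a birth date as a (year, month) pair are assumptions made for illustration.

```python
# Birth-date window for Grade 11/Age 17 as stated in the text:
# October 1966 through September 1967, inclusive.
AGE_17_WINDOW = ((1966, 10), (1967, 9))
MODAL_GRADE_AGE_17 = 11

def is_age_eligible(birth_year, birth_month, window=AGE_17_WINDOW):
    """True if the birth month falls inside the window (both ends inclusive)."""
    start, end = window
    # Tuples compare year first, then month, so both must be checked together.
    return start <= (birth_year, birth_month) <= end

def is_eligible(birth_year, birth_month, grade):
    """Year 15 rule: eligible if age-eligible OR enrolled in the modal grade."""
    return is_age_eligible(birth_year, birth_month) or grade == MODAL_GRADE_AGE_17

# An 11th grader born outside the window is still grade-eligible.
print(is_eligible(1965, 5, 11))
# A student promoted to 12th grade remains eligible only if age-eligible.
print(is_eligible(1967, 1, 12), is_eligible(1965, 5, 12))
```

Because the window crosses a year boundary, checking the year alone would wrongly admit a student born in September 1966 or October 1967, which is why both month and year had to be verified.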

The second factor complicating sampling was the addition of spiral
sessions to the existing practice of tape sessions. Students had to be
sampled at different rates for spiral and tape sessions, and only
age-eligible students were eligible for tape sessions. Thus, supervisors
sampled  for  spiral  sessions  first.    Then,  renumbering  those  age-eligible 
students  who  had  not  been  sampled  for  spiral,  supervisors  selected  the  tape 
session  students. 
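
The two-phase order described above (spiral first, then tape from the renumbered remainder) can be sketched as follows. This is an illustration under stated assumptions, not the procedure in the Supervisor's Manual: the sampling rates, the record layout, the function name `draw_spiral_then_tape`, and the use of simple random sampling are all hypothetical.

```python
import random

def draw_spiral_then_tape(students, spiral_rate, tape_rate, seed=0):
    """Two-phase selection: spiral first, then tape from the remainder.

    `students` is a list of dicts with keys "id", "age_eligible", and
    "grade_eligible". The rates and the simple random draw are
    placeholders for this sketch.
    """
    rng = random.Random(seed)

    # Phase 1: every age- or grade-eligible student can be drawn for spiral.
    eligible = [s for s in students if s["age_eligible"] or s["grade_eligible"]]
    n_spiral = round(len(eligible) * spiral_rate)
    spiral = rng.sample(eligible, n_spiral)
    spiral_ids = {s["id"] for s in spiral}

    # Phase 2: renumber the age-eligible students not drawn for spiral,
    # then draw the tape sample from that renumbered list only.
    remaining = [s for s in eligible
                 if s["age_eligible"] and s["id"] not in spiral_ids]
    renumbered = list(enumerate(remaining, start=1))
    n_tape = round(len(renumbered) * tape_rate)
    tape = [s for _, s in rng.sample(renumbered, n_tape)]
    return spiral, tape

# Hypothetical roster of 40 listed students.
roster = [{"id": i, "age_eligible": i % 3 != 0, "grade_eligible": i % 2 == 0}
          for i in range(1, 41)]
spiral, tape = draw_spiral_then_tape(roster, spiral_rate=0.5, tape_rate=0.25)
print(len(spiral), len(tape))
```

Drawing the spiral sample first guarantees that no student lands in both session types, and restricting the second phase to age-eligible students reflects the rule that grade-only eligibles could not be assigned to tape sessions.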

Instructions  for  sampling  were  provided  in  the  Supervisor's  Manual. 
Because  of  the  attention  to  detail  required  in  the  sampling,  supervisors 
were  required  to  do  all  sampling  themselves  and  could  not  delegate  this 
responsibility  to  exercise  administrators  except  under  extraordinary 
circumstances  which  had  to  be  reviewed  with  the  field  director. 

The modal grade is the grade attended by the majority of age-eligible
students, that is, the 4th grade for 9-year-olds, the 8th grade for
13-year-olds and the 11th grade for 17-year-olds.

In the tape sessions, all students were administered the same type of
booklet. In the spiral sessions, each student received one of 63 different
self-administered booklets. Chapter 4 provides more information regarding
tape and spiral sessions.



Sampling  was  monitored  by  Westat  statistical  staff  in  several  ways, 
including  through  the  design  of  the  sampling  instructions  that  were  sent  to 
supervisors  (the  Session  Assignment  Forms).    Using  school  enrollment 
information  on  the  Principal  Questionnaire,  the  Session  Assignment 
Form  for  each  school  provided  a  range  within  which  the  count  of  names  on 
the Student Listing Form had to fall. If the count of names fell outside
this range, the supervisor had to call Westat. Gross
errors  in  preparing  the  Student  Listing  Form  could  be  detected  at  this 
stage.    For  example,  if  a  school  included  only  students  in  the  grade  who 
were  age-eligible,  the  number  of  names  on  the  Student  Listing  Form  would 
usually  fall  below  the  lower  limit  on  the  Session  Assignment  Form. 
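The range check on the Session Assignment Form can be sketched as a small function. This is only an illustration: the function name, the returned messages, and the example limits are hypothetical, and the report does not say how the limits were derived from the enrollment figures.

```python
def check_listing_count(count, lower_limit, upper_limit):
    """Classify the count of names on a Student Listing Form.

    lower_limit and upper_limit stand in for the range printed on the
    school's Session Assignment Form (values here are hypothetical).
    """
    if lower_limit > upper_limit:
        raise ValueError("lower_limit must not exceed upper_limit")
    if count < lower_limit:
        return "below range: call Westat"
    if count > upper_limit:
        return "above range: call Westat"
    return "within range"


# A school that listed only age-eligible students in the grade would
# typically come in under the lower limit.
status = check_listing_count(80, 100, 150)
```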

In  addition,  each  supervisor  was  required  to  report  by  telephone  the 
following  information  to  the  statistical  staff  for  the  first  school 
sampled: 

(1)  PSU; 

(2)  school  ID  number; 

(3)  total  of  students  listed  on  the  Student  Listing  Form,  including 
any  lined  out; 

(4)  total of students lined out on the Student Listing Form;

(5)  last  line  number  on  the  Student  Listing  Form; 

(6)  total  of  students  selected  for  spiral  session(s),  excluding 
any  lined  out; 

(7)  if  tape  session  in  school,  the  number  of  age-eligible  students 
(e.g.,  13-year-olds),  excluding  those  lined  out  and  sampled  for  a 
spiral  session;  and 

(8)  total  of  students  selected  for  tape  session(s),  excluding 
any  lined  out. 

Using  this  information  and  the  sampling  rates  specified  on  the  Student 
Listing  Form,  the  statistical  staff  checked  that  the  sampling  had  been 
carried  out  correctly.    The  statistical  staff  also  was  available  to  answer 
questions  from  the  supervisors. 

Verifying  the  sample  was  also  a  primary  focus  of  the  quality  control 
visits  made  by  Westat  and  ETS  staff.    With  very  few  exceptions,  supervisors 
carried  out  their  sampling  responsibilities  carefully  and  conscientiously. 
In  the  case  of  one  supervisor,  it  was  felt  that  additional  site  visits  by 
Westat  staff  were  necessary  until  satisfactory  performance  was  assured. 
The details of these visits are discussed in the Report on Sample
Selection,  Weighting  and  Variance  Estimation:    NAEP  Year  15  (Lago,  Burke, 
Tepping,  &  Hansen,  1985). 




7.3.2    Conduct  of  the  Assessments 


It was, perhaps, in the arrangements for the actual conduct of the
assessment sessions that the district supervisors and exercise
administrators  had  to  be  the  most  flexible  and  diplomatic.    The  physical 
space  and  time  available  in  the  schools  often  did  not  meet  the  ideal  as 
specified  in  the  manuals.    In  general,  elementary  schools  were  the  most 
flexible  and  were  willing  to  adapt  to  the  district  supervisor's  schedule. 
This  was  fortunate  because  the  Grade  4/Age  9  students  were  assessed  during 
the  winter  and  there  were  times  when  supervisors  had  to  cancel  and 
reschedule  sessions  because  of  inclement  weather.    The  junior  and  senior 
high  schools  were  less  flexible  and  were  more  likely  to  make  special 
requests  for  scheduling  and  timing.    For  example,  some  large  schools  wanted 
all  spiral  sessions  administered  at  the  same  time  in  an  auditorium.  To 
accommodate  this,  the  supervisor  acted  as  an  exercise  administrator  and 
sometimes  had  to  train  the  school's  teachers  in  NAEP  procedures  so  they 
could  act  as  exercise  administrators.    A  session  typically  ran  about  one 
hour,  and  only  one  school  required  that  the  assessment  be  done  within  the 
time  limits  of  its  40-minute  class  periods.  To  accommodate  this,  the 
Introduction  and  Part  I  of  the  booklets  were  administered  one  day  and  Parts 
II-IV were administered the following day.

Another school request which demanded flexibility on the part of
supervisors and modified procedures was that all eligible students be
assessed, not just those who were sampled. Schools generally made this
request when the sample of students to be assessed included all but a few
students in a class. In these cases, the school preferred that the whole
class  be  assessed  so  that  the  teacher  could  do  other  things  and  the 
not-in-sample  students  would  not  feel  that  they  had  been  excluded  for  some 
reason. 

Although  supervisors  had  to  be  flexible  in  arranging  and  staffing 
sessions,  the  schedule  of  activities  on  the  day  of  assessment  was  standard. 
The  supervisor  and  exercise  administrators  would  arrive  early  at  the  school 
to  meet  with  the  coordinator  and  review  the  assessment  plan.    The  exercise 
administrators  (and  supervisor  if  he  or  she  would  be  conducting  sessions) 
would  assign  booklet  numbers  to  the  students  listed  on  each  Administration 
Schedule  (a  listing  of  the  names,  ages  and  sex  of  every  student  invited  to 
a  session),  as  specified  in  the  manual.    They  would  then  go  to  the  assigned 
location  for  the  first  session  and  wait  for  the  students  to  arrive. 

Supervisors  found  it  very  helpful  to  have  as  coordinator  someone  who 
was  interested  in  NAEP  and  willing  to  make  sure  that  students  got  to  the 
appropriate  sessions.    By  emphasizing  that  makeup  sessions  would  have  to  be 
scheduled if attendance was low, supervisors were often able to galvanize
the  coordinators  into  action  to  get  the  students  to  the  appropriate 
sessions.    Actively  involved  coordinators  made  sure  that  teachers  and 
students  knew  about  NAEP,  used  the  public  address  system  to  announce 
sessions  and  call  out  the  names  of  missing  students,  and  went  from 
classroom  to  classroom  to  hunt  for  missing  students.    If  the  supervisor  was 
not  conducting  sessions,  he  or  she  would  do  some  of  these  same  things  and 
encourage  the  school  staff  to  make  every  effort  to  increase  attendance. 




Makeup  sessions  were  required  for  tape  sessions  whenever  attendance  at 
a  single  tape  session  was  50  percent  or  less  for  the  Grade  4/Age  9  and 
Grade  8/Age  13  schools  and  75  percent  or  less  for  the  Grade  11/Age  17 
schools.    Makeup  sessions  were  required  for  spiral  sessions  whenever 
attendance  at  a  school's  combined  sessions  was  75  percent  or  less. 
Information  on  makeup  sessions  is  summarized  in  Table  7(2). 
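The makeup rules above can be expressed as a short sketch. The thresholds are those stated in the text; the function names are hypothetical, and the use of the number of students invited as the denominator of the attendance rate is an assumption.

```python
# Attendance thresholds (percent) at or below which a makeup was required.
TAPE_THRESHOLD = {"4/9": 50.0, "8/13": 50.0, "11/17": 75.0}
SPIRAL_THRESHOLD = 75.0


def attendance_pct(attended, invited):
    """Attendance as a percentage of students invited (assumed base)."""
    return 100.0 * attended / invited


def tape_makeup_required(grade_age, attended, invited):
    """Tape rule: applied to each tape session individually."""
    return attendance_pct(attended, invited) <= TAPE_THRESHOLD[grade_age]


def spiral_makeup_required(sessions):
    """Spiral rule: applied to a school's spiral sessions combined.

    sessions: list of (attended, invited) pairs, one per spiral session.
    """
    attended = sum(a for a, _ in sessions)
    invited = sum(i for _, i in sessions)
    return attendance_pct(attended, invited) <= SPIRAL_THRESHOLD
```

Under this sketch a weak spiral session can be offset by a well-attended one in the same school, whereas each tape session has to clear its threshold on its own.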

In Year 15, as in previous years of NAEP, makeup sessions were required
most  frequently  during  the  Grade  11/Age  17  assessment.    In  fact,  makeups 
were  required  in  less  than  one  percent  of  the  Grade  4/Age  9  and  Grade  8/Age 
13  schools,  but  in  slightly  over  20  percent  of  the  Grade  11/Age  17  schools. 
Even  though  the  attendance  rate  requirement  was  the  same  for  spiral  and 
tape  sessions,  of  the  Grade  11/Age  17  schools,  about  19  percent  of  those 
with  spiral  sessions  required  makeups  while  about  24  percent  of  those  with 
tape  sessions  scheduled  makeups.    This  higher  makeup  rate  for  tape  sessions 
may  have  resulted  because  the  attendance  requirement  applied  to  each  tape 
session  individually  but  to  the  spiral  sessions  combined.    Also,  a  student 
could attend any of the spiral sessions but, if sampled for tape, he or she
had  to  attend  the  specified  session. 
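The school-level percentages quoted above can be recomputed from the counts in Table 7(2). The sketch below is a simple arithmetic check; the dictionary layout and the helper name `pct` are illustrative choices, not part of the report.

```python
# (schools with sessions, schools with makeup sessions) from Table 7(2)
schools = {
    "4/9": (661, 2),
    "8/13": (478, 4),
    "11/17": (326, 67),
}


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


percent_with_makeups = {g: pct(m, n) for g, (n, m) in schools.items()}

total_schools = sum(n for n, _ in schools.values())
total_makeup_schools = sum(m for _, m in schools.values())
overall = pct(total_makeup_schools, total_schools)
```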

As  shown  in  Table  7(3),  makeup  sessions  represented  a  small  proportion 
of the number of sessions conducted. Even for Grade 11/Age 17, makeups
were  less  than  10  percent  of  all  sessions  conducted  (6  percent  of  all 
spiral  sessions  and  20  percent  of  all  tape  sessions). 

As  shown  in  Table  7(3),  a  total  of  161  makeup  sessions  were  held:  95 
spiral  and  66  tape.    The  purpose  of  makeup  sessions  was  to  improve  the 
response  (i.e.,  attendance)  rate  for  Year  15;  the  actual  effect  of  makeup 
sessions  on  response  rates  is  shown  in  Table  7(4). 

Because  only  six  spiral  and  no  tape  makeup  sessions  were  required  for 
Grade  4/Age  9  and  Grade  8/Age  13,  the  impact  on  overall  attendance  rates 
was minimal. However, for Grade 11/Age 17, where about 9 percent of all
sessions  were  makeups,  there  were  significant  increases  in  the  response 
rate. 


7.3.3    Students Sampled, Invited and Assessed

As  mentioned  earlier  and  described  in  Chapter  4,  the  combined  use  of 
tape  and  spiral  sessions,  along  with  the  introduction  of  grade  as  well  as 
age  samples,  complicated  the  sampling  process.    A  target  number  of  students 
completing  assessments  was  established  for  each  age  group  separately  for 
spiral  and  tape  samples.    Then,  using  data  from  previous  assessments  on 
percent  excluded  and  response  rates,  sample  sizes  were  determined  for  Year 
15.    As  shown  in  Table  4(5)  (Chapter  4),  the  actual  numbers  of  students 
assessed  were  considerably  higher  than  the  target  numbers  for  each 
grade/age. 




Table 7(2)
Frequency of Makeup Sessions

                   Number of Schools

                 With       With Makeup      Percent with
Grade/Age      Sessions      Sessions       Makeup Sessions

4/9               661             2               0.3
8/13              478             4               0.8
11/17             326            67              20.6
Total            1465            73               5.0


Table 7(3)
Regular and Makeup Sessions Conducted

                                         Number of              Percent of
             Number of Sessions       Makeup Sessions        Makeup Sessions

Grade/Age   Spiral  Tape  Total     Spiral  Tape  Total     Spiral  Tape  Total

4/9          1330    260   1590         2     0      2        0.2    0.0    0.1
8/13         1317    261   1578         4     0      4        0.3    0.0    0.3
11/17        1416    324   1740        89    66    155        6.2   20.4    8.9
Total        4063    845   4908        95    66    161        2.3    7.8    3.3

Table 7(4)

Change in Attendance Rates With Makeup Sessions

                   Change in Rates (%)
Grade/Age     Spiral Sessions    Tape Sessions

4/9             [illegible]            0
8/13                +1                 0
11/17               +4                +4
Overall             +1                +2




7.3.4    Supervisors' Other Assessment-Related Tasks

A  variety  of  other  tasks  were  undertaken  by  the  district  supervisors  to 
assure  the  successful  completion  of  the  assessments  and  to  gather  other 
survey  data  required  by  NAEP.    Among  these  supervisory  tasks  were 
completing  assessment  reporting  forms;  finalizing  arrangements  for  the 
assessments;  supervising  exercise  administrators;  distributing  and 
collecting other data forms and questionnaires; editing, boxing and
shipping assessment materials; mailing thank-you letters to coordinators;
and  filling  out  a  project  evaluation  for  Westat. 

When sampling was completed, the supervisor and/or the exercise
administrators  filled  out  an  Administration  Schedule  for  each  assessment 
session  to  be  held  in  the  school.    The  administration  schedules  were  the 
student  rosters  for  the  assessment  sessions.    They  identified  which 
students  were  to  attend  each  session  and  the  time  and  location  of  the 
sessions.    Some  schools  used  copies  of  the  Administration  Schedules  to 
notify  teachers  and  students.    Others  wanted  an  appointment  card  for  each 
student,  which  the  exercise  administrators  filled  out  from  the 
Administration  Schedule. 

The supervisor also filled out a School Worksheet, containing
information on the number of students absent and assessed. Because of the
variety of forms and materials pertaining to each school, Westat developed
a  school  folder  which  could  be  used  by  the  supervisor  to  keep  all  materials 
pertaining  to  a  school. 

7.3.4.1    Finalizing  Arrangements  for  the  Assessments 

The process of finalizing arrangements for the assessment sessions
began prior to the introductory meeting. The supervisor developed a
general plan for completing all the assessments in a PSU, taking into
consideration each school's geographic location and number of sessions. At
the  introductory  meeting,  each  school's  schedule  and  constraints  were 
identified  and  a  tentative  date  established.  In  general,  this  date 
specified  the  week  the  assessment  would  occur,  since  the  supervisor  was 
advised  to  wait  until  all  meetings  had  been  held  and  the  schedule  for  all 
schools known before setting up specific dates with schools. Some schools,
however,  insisted  that  the  actual  dates  of  assessment  be  set  at  the  time  of 
the  introductory  meeting. 

At  the  introductory  meeting,  the  supervisor  completed  a  School  Control 
Form  to  let  the  home  office  know  the  schedule  of  assessments.    Westat  then 
sent a confirmation memo to the schools and a reminder letter about two
weeks  prior  to  the  assessment. 

Seven  to  ten  days  before  the  assessment  week,  the  district  supervisor 
called the school coordinator to establish (or confirm) the definite dates
for sampling and assessment. At the time of sampling, dates and times for
the individual sessions were confirmed and recorded on the Administration
Schedules. Depending upon the time lag between sampling and assessment,
the  supervisor  would  contact  the  school  coordinator  one  or  more  times  to 
confirm  arrangements. 

Since  district  supervisors  were  busy  in  schools  and  were  hard  to  reach 
during  the  day,  schools  were  instructed  to  call  Westat  home  office  staff  if 
they  needed  to  get  in  touch  with  the  supervisor.    Home  office  staff 
received  an  average  of  three  to  five  calls  per  day  from  schools  with 
questions  or  requests  for  schedule  changes.    If  possible,  home  office  staff 
resolved  their  questions.    If  necessary,  calls  were  made  to  the 
supervisor's  home  or  hotel,  or  even  to  the  school  where  he  or  she  was 
working. 

On  the  day  of  an  assessment,  the  supervisor  usually  went  to  the  school 
to  oversee  all  assessment  activities,  handle  any  special  situations  that 
arose  and,  if  necessary,  make  minor  changes  in  the  location  or  time  of 
sessions. 


7.3.4.2    Supervising Exercise Administrators

Supervisors  were  responsible  for  the  work  of  their  exercise 
administrators.    Because  the  supervisor  was  frequently  in  the  school,  at 
least through the first assessments of the day, he or she had ample
opportunity to observe the exercise administrators at work. It was
mandatory that the supervisors observe the first assessment sessions
conducted by each exercise administrator and review the exercise
administrator's coding of booklets. Supervisors reported each observation
of  an  exercise  administrator  on  a  Weekly  Status  Report. 

District supervisors had the authority to dismiss exercise
administrators  they  considered  incompetent  and  to  retrain  exercise 
administrators  as  necessary.  Supervisors  took  this  responsibility 
seriously;  in  general,  exercise  administrators  conducted  sessions 
competently  and  with  minimum  disruption  to  the  schools. 


7.3.4.3    Distributing and Collecting NAEP Questionnaires

The  School  Characteristics  and  Policy  Questionnaire,  Excluded  Student 
Questionnaire  and  Teacher  Questionnaire  were  distributed  in  the  schools  to 
be  completed  by  school  personnel.    If  these  forms  were  completed  in  time, 
the  supervisor  collected  them  and  shipped  them  to  ETS.    The  School 
Characteristics  and  Policy  Questionnaire  was  mailed  by  Westat  to  the  school 
prior  to  the  assessment  with  the  confirmation  memo.    All  other  forms  were 
distributed  to  the  school  at  the  time  of  the  assessment  by  the  supervisor. 

An  Excluded  Student  Questionnaire  was  to  be  filled  out  for  every 
student who was sampled but was ineligible or excluded from the assessment.
The majority of excluded students were those who were determined by the
school to be unable to participate in NAEP because they were of limited
English-speaking ability, educable mentally retarded, or functionally
disabled. For each of these students, the supervisor gave a questionnaire
to  the  coordinator  and  asked  that  it  be  filled  out  by  a  teacher  of  the 
student.    If  a  student  was  excluded  because  he  or  she  was  no  longer 
enrolled  in  the  school  or  had  been  sampled  although  ineligible  for  the 
study,  the  supervisor  filled  out  the  form.    Year  15  is  the  first  year  that 
detailed  data  have  been  obtained  on  the  excluded  students  (see  Chapter  6). 

The  Teacher  Questionnaire  is  also  new  with  Year  15.    For  this  survey,  a 
subsample  of  students  sampled  for  spiral  sessions  was  identified.  The 
subsample  was  equal  to  the  number  of  spiral  sessions  assigned  to  the 
school.    Thus,  if  there  were  six  spiral  sessions  assigned  to  a  school,  a 
subsample  of  six  students  already  sampled  for  spiral  sessions  would  be 
selected.    The  school  coordinator  was  asked  to  identify  the  English  or 
Language  Arts  teacher  of  each  student  so  identified.    Those  teachers  were 
asked  to  complete  a  Teacher  Questionnaire. 

The  supervisor  attempted  to  obtain  completed  questionnaires  from  the 
school  by  the  time  he  or  she  completed  other  assessment  activities.  If 
school staff could not give completed forms to the supervisor, the
supervisor  left  an  envelope  for  the  coordinator  to  mail  completed  forms 
to  ETS. 

Initial  response  for  all  three  questionnaires  was  very  good.  Overall, 
92.7  percent  of  the  Excluded  Student  Questionnaires,  88.7  percent  of  the 
School  Characteristics  and  Policy  Questionnaires  and  86.9  percent  of  the 
Teacher  Questionnaires  that  had  been  distributed  were  collected  and 
returned by the supervisors to ETS. Response, although high, was lowest
for the teacher survey. This may have been because the questionnaires were
passed from the supervisor to the school coordinator to the teachers,
creating greater opportunity for the questionnaires to be misplaced and
greater difficulty in collecting completed questionnaires.

7.3.4.4    Editing, Boxing, and Shipping Assessment Materials

Selected  items  from  the  Administration  Schedule  were  coded  onto  the 
front cover of the assessment booklets. This responsibility was shared by
supervisors  and  exercise  administrators,  although  supervisors  had  to  review 
all  work  completed  by  the  exercise  administrators.    Supervisors  were  to 
ship  to  ETS  all  assessment  materials  for  a  school  within  a  week  of 
completing  the  assessment  in  that  school.    At  the  end  of  assessments  in 
each  PSU,  supervisors  shipped  PSU-specific  materials  to  ETS;  at  the  end  of 
assessments  for  a  grade/age,  the  tapes  and  forms  specific  to  that  grade/age 
were  shipped  back.    At  the  end  of  the  field  period,  all  materials  were 
either  returned  to  ETS,  returned  to  Westat,  or  discarded. 

For materials returned to ETS, district supervisors completed and
mailed separately to ETS a pre-printed postcard upon which they recorded
the shipment date, PSU number, school number, number of cartons shipped and
the mode of shipment (U.S. mail, United Parcel, etc.).




If  after  seven  days  from  receipt  of  the  mail-alert  postcard  the 
materials had not been received at ETS, Westat was notified. Westat in
turn contacted the district supervisor to begin the process of tracing the
shipment.    Fifty-five  assessment  books  were  lost  or  damaged  in  transit. 
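The seven-day tracing rule lends itself to a small date check. The sketch below is illustrative only: the function name and arguments are hypothetical, and treating exactly seven elapsed days as overdue is an assumption about how the rule was applied.

```python
from datetime import date, timedelta

TRACE_AFTER = timedelta(days=7)


def needs_trace(postcard_received, materials_received, today):
    """Return True if Westat should be notified to trace a shipment.

    postcard_received:  date ETS received the mail-alert postcard
    materials_received: date ETS received the cartons, or None if not yet
    today:              date on which the check is run
    """
    if materials_received is not None:
        return False
    # Assumption: the shipment counts as overdue once seven days have elapsed.
    return today - postcard_received >= TRACE_AFTER


# Hypothetical dates for illustration.
overdue = needs_trace(date(1984, 1, 2), None, date(1984, 1, 9))
```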


7.3.4.5    Close-Out  Activities 

At  the  end  of  the  field  period,  district  supervisors  were  sent  copies 
of  a  thank-you  letter  to  the  school  coordinators.    This  letter  was  signed 
and  sent  by  the  supervisor  as  a  personal  thanks  to  the  coordinator.  At  the 
same  time,  Westat  mailed  letters  of  appreciation  to  superintendents  and 
heads  of  private  schools.    School  principals  were  sent  a  certificate  of 
appreciation  from  ETS. 

As  a  final  task,  district  supervisors  were  asked  to  complete  and  return 
to Westat an evaluation of Year 15 field activities. The recommendations
made  by  the  supervisors  will  be  incorporated  into  future  assessments. 


7.3.5    Quality  Control  and  Evaluation  Studies 

There  were  two  specifically  designed  quality  control  studies  of  the 
field effort. The first, and most intensive, involved on-site visits by
Westat  and  ETS  staff  to  verify  the  sampling  and  to  observe  the  supervisors 
and  exercise  administrators  as  they  conducted  assessments.  The  second  study 
was  a  telephone  survey  of  a  10  percent  sample  of  schools.    This  survey  took 
place  after  the  field  period  had  ended  and  all  assessment  activities  had 
been  completed  in  the  schools.    As  part  of  the  telephone  survey,  the  school 
coordinators  were  thanked  for  the  school's  participation  and  asked  about 
their  experiences  with  NAEP  and  the  field  staff. 


7.3.5.1    On-Site  Quality  Control  Visits 

At  the  beginning  of  each  grade/age  assessment,  a  sample  of  schools  was 
selected for quality control visits by Westat and ETS staffs. The purpose
of these visits was twofold. First, they provided data from which rough
estimates  could  be  made  of  the  quality  of  assessment  activities, 
particularly  the  sample  selection  of  students.    The  second  purpose  of  the 
quality  control  visits  was  to  observe,  in  person,  the  work  of  supervisors 
and  exercise  administrators  to  identify  and  correct  areas  of  confusion  or 
error. 

The  design  of  the  sample  of  schools  for  the  quality  control  visits 
sought  to  satisfy  both  purposes  of  the  visits  through  a  combination  of 
purposive  and  probability  sampling.    The  probability  sample  was  designed  to 
provide  data  to  assess  the  quality  of  assessment  activities.    The  purposive 
sample  (where  Westat  field  management  specified  which  supervisors  should  be 
visited)  allowed  judgmental  selection  of  those  supervisors  who  we  thought 
could  benefit  from  further  observation. 




The  number  and  distribution  of  quality  control  visits  among  grade/ages 
is  provided  in  Table  7(5). 

Because  of  the  importance  of  identifying  problem  areas  and  taking 
corrective  action  as  quickly  as  possible,  half  of  all  quality  control 
visits  (two  thirds  of  the  purposive  visits)  were  scheduled  from  October  to 
December  1983,  when  the  Grade  8/Age  13  assessments  were  taking  place. 
Thirty-two  schools  were  visited  during  that  period,  sixteen  by  Westat  and 
sixteen  by  ETS.    Each  supervisor  was  visited  twice,  once  by  ETS  and  once  by 
Westat. 

During the next grade/age assessment, Grade 4/Age 9, visits were made
to schools of twelve of the sixteen supervisors. Schools in eight of the
supervisory regions were selected at random; the remaining four were
selected purposively. In anticipation of special problems in high schools,
given their size and relatively lower attendance rates in previous years of
NAEP, the number of quality control visits during the third grade/age
assessment, Grade 11/Age 17, was
increased  to  20  so  that  each  supervisor's  region  was  visited  at  least  once, 
and  four  were  visited  more  than  once. 

In general, the visits went well. (The Report on Sample Selection
provides  more  specific  results  of  the  quality  control  visits.)  The 
sampling  problems  that  were  identified  tended  to  be  occasional  random 
errors of carelessness rather than systematic errors reflecting a
misunderstanding  of  procedures.    Similarly,  the  kinds  of  procedural 
mistakes  made  by  supervisors  and  exercise  administrators  tended  to  be 
stylistic  (not  speaking  loud  enough,  not  following  the  prescribed  script) 
rather  than  a  result  of  misunderstanding.    These  issues  were  discussed  with 
the  individual  supervisor  as  they  occurred.    If  applicable,  a  general  field 
memorandum  was  prepared  on  specific  issues  and  distributed  to  all 
supervisors. 


7.3.5.2    Telephone  Survey 

In  early  May,  Westat  selected  a  10  percent  sample  of  the  participating 
schools  for  inclusion  in  a  telephone  survey.    The  purpose  of  the  telephone 
survey  was  to  give  the  principals  or  school  coordinators  an  opportunity  to 
comment  and  make  suggestions  about  operational  procedures  and  the  conduct 
of  the  field  staff.    Details  concerning  the  telephone  survey  and 
questionnaire are contained in the Report on Field Operations.


7.4    Field  Management 

Various administrative reporting forms were developed for
pre-assessment and assessment activities in order to monitor the progress
of work. These forms and how they were used are described in this section.
Copies  of  these  forms  can  be  found  in  the  Report  on  Field  Operations. 




Table 7(5)

Number and Distribution of Quality Control Visits

                 Number of Schools in Sample

Grade/Age      Purposive    Probability    Both

8/13               16            16          32
4/9                 4             8          12
11/17               4            16          20
Total              24            40          64




7.4.1    Monitoring Field Activities


Several  approaches  were  taken  to  monitor  the  progress  of  work  during 
the pre-assessment and assessment phases.

During  the  pre-assessment  activities  (arrangements  for  and  conduct  of 
introductory meetings), the district supervisors reported to the field
director at Westat at least once a week to review progress in scheduling
introductory meetings and to discuss any problems or difficulties. Each
week the receipt clerk reported to the field director the number of Results
of Contact Forms received at Westat. This report provided a good
indication of the progress of the contacts and scheduled meetings. In
addition, an automated management system was developed which contained a
record  for  each  sampled  school.    A  disposition  code  structure  was  developed 
to  indicate  the  status  of  the  school's  participation  (e.g.,  cooperating, 
school  refusal,  district  refusal,  school  closed,  school  dropped—no  age 
eligibles,  etc.).    When  a  Schedule  of  Introductory  Meetings  was  received  at 
Westat,  the  receipt  clerk  keyed  a  cooperating  disposition  code  for  each 
school  invited  to  attend  the  introductory  meeting.    If  a  school  or  school 
district  refused,  as  noted  on  the  Results  of  Contact  Form,  a  refusing 
disposition  code  was  keyed  for  each  refusing  school.    The  School  Update 
Form  was  the  source  of  disposition  status  for  schools  that  closed,  had  a 
grade  or  enrollment  change  that  made  them  ineligible,  or  had  no 
age-eligible  students. 

Disposition  reports  were  generated  from  the  receipt  system  at  least 
once  a  week  to  review  the  progress  of  securing  cooperation  from  the  sampled 
schools.    Four  different  reports  were  generated  during  the  pre-assessment 
activities.    The  first  report  gave  a  breakdown  of  the  number  of  schools  by 
disposition  code  in  each  PSU.    The  second  report  listed  the  ID  number  and 
school  name  for  each  cooperating  school  for  which  a  Principal  Questionnaire 
had not been received. The third report listed the ID number and school
name  for  each  school  without  a  disposition  code.    The  fourth  report  listed 
the  ID  numbers  and  school  names  for  non-cooperating  schools  (refusals, 
school closed, no age eligibles) and their disposition codes.
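The four disposition reports can be sketched against a simple per-school record. This is a modern illustration of the logic only, not a reconstruction of Westat's receipt system: the record fields, function names, and string labels standing in for the disposition codes are all hypothetical.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Optional

COOPERATING = "cooperating"  # stand-in label for the cooperating code


@dataclass
class SchoolRecord:
    school_id: str
    name: str
    psu: str
    disposition: Optional[str] = None      # None until a code is keyed
    principal_questionnaire: bool = False  # True once received


def counts_by_psu(records):
    """Report 1: schools by disposition code in each PSU (coded schools only)."""
    table = defaultdict(Counter)
    for r in records:
        if r.disposition is not None:
            table[r.psu][r.disposition] += 1
    return {psu: dict(counts) for psu, counts in table.items()}


def missing_principal_questionnaire(records):
    """Report 2: cooperating schools with no Principal Questionnaire received."""
    return [(r.school_id, r.name) for r in records
            if r.disposition == COOPERATING and not r.principal_questionnaire]


def without_disposition(records):
    """Report 3: schools with no disposition code yet."""
    return [(r.school_id, r.name) for r in records if r.disposition is None]


def non_cooperating(records):
    """Report 4: non-cooperating schools with their disposition codes."""
    return [(r.school_id, r.name, r.disposition) for r in records
            if r.disposition not in (None, COOPERATING)]


# Hypothetical records for illustration.
records = [
    SchoolRecord("001", "School A", "PSU-1", COOPERATING, True),
    SchoolRecord("002", "School B", "PSU-1", COOPERATING, False),
    SchoolRecord("003", "School C", "PSU-1", "school refusal"),
    SchoolRecord("004", "School D", "PSU-2"),
]
```

Keeping uncoded schools out of the first report and listing them in the third mirrors the split between the reports described above.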

These  reports  were  an  invaluable  tool  for  the  sampling  statisticians  as 
well  as  for  the  field  director  and  assistant  field  director.    They  provided 
the  statisticians  with  the  information  needed  to  determine  whether  the 
sample of schools was adequate to produce representative results. Based on
the information contained in these reports, the sampling statisticians
substituted new schools into the sample to replace some of the
non-cooperating schools and supplemented the original sample with
additional schools
where  needed. 

During  the  assessment  activities  the  automated  management  system  was 
expanded  to  include  the  results  of  the  actual  assessments.    Data  from  the 
School  Worksheet  on  number  of  students  to  be  assessed,  number  assessed,  and 
number absent were keyed by session for both spiral and tape sessions. In
addition, data from the Roster of Questionnaires on the number of Excluded
Student Questionnaires and Teacher Questionnaires collected and shipped to
ETS, and whether the School Characteristics and Policy Questionnaire was
shipped  to  ETS  were  also  keyed  into  the  receipt  system. 

A response rate report was generated weekly which allowed the project
staff to monitor the progress of the assessments by checking both that the
schools were assessed on schedule and that a high response rate was
achieved. The sampling statisticians used these reports to monitor the
sample  yield  by  school,  PSU  and  grade/age. 

Another method used to monitor the progress of the assessments was the
twice-weekly  telephone  report  between  the  field  director  or  assistant  field 
director  and  each  supervisor.    During  these  phone  conversations,  the  field 
director  and  assistant  field  director  reviewed  the  supervisor's  schedule  as 
well  as  any  problems  the  supervisor  was  experiencing. 

The  district  supervisors  were  required  to  complete  a  monthly  calendar 
which  indicated  when  each  school  was  being  sampled  and  assessed.  This 
calendar  served  two  purposes.    First,  it  allowed  the  field  director  and 
assistant field director to review the supervisor's schedule and the
distribution  of  work  for  the  month.    Second,  it  enabled  the  field  director 
and  assistant  field  director  to  locate  the  supervisor  when  urgent  messages 
had  to  be  relayed. 

In  addition  to  the  monthly  calendars,  the  supervisors  completed  a 
Weekly  Status  Report.    This  report  indicated  which  schools  had  been  called 
to  confirm  the  assessment  date(s)  as  well  as  the  dates  that  the  sample  was 
drawn,  assessment  completed,  school  shipment  mailed  to  ETS,  and  when 
exercise  administrators  were  observed  for  each  school.    Information  from 
this  report  was  reviewed  with  the  supervisor  during  the  twice-weekly 
telephone  call. 


7.4.2    Materials  and  Reporting  Forms  for  Assessment  Activities 

At  the  second  training  session  in  October  (prior  to  the  start  of 
assessment activities), Westat provided the supervisors with the reporting
forms  and  supplies  needed  for  the  assessment  phase,  as  well  as  the  Session 
Assignment  Forms  which  were  used  to  sample  students  for  the  assessment  and 
sample  teachers  for  the  teacher  survey.    ETS  also  provided  materials  and 
supplies  necessary  for  the  conduct  of  the  assessment  and  shipped  them  to 
the  supervisors'  homes.    At  the  start  of  each  grade/age  assessment, 
additional  materials  were  shipped  to  replenish  supplies. 

Supplies  provided  by  Westat  included  No.  2  pencils,  pencil  sharpeners, 
appointment cards, timers, tape recorders, additional supplies of parental
consent  letters  and  NAEP  brochures,  and  Dali  postcards  for  assessment  (the 
postcard  was  placed  inside  specific  test  booklets  and  was  used  by  the 
students  assigned  these  booklets  to  complete  some  of  the  test  items). 

During  training,  the  supervisors  were  given  a  binder  with  a  Session 
Assignment  Form  for  each  sampled  school  in  the  Grade  8/Age  13  sample  (Grade 
8/Age 13 assessments began the week after training). The Session
Assignment Form provided the supervisor with the number of each type of
session  (spiral  and  tape)  to  be  held  in  the  school  as  well  as  the  line 
numbers  that  designated  the  students  selected  for  assessment  from  those 
listed  on  the  Student  Listing  Form.    In  addition,  line  numbers  were 
provided  for  use  in  selecting  teachers  for  the  teacher  survey.  Session 
Assignment  Forms  for  the  other  two  grade/age  assessments  were  sent  several 
weeks  prior  to  the  start  of  each  grade/age  assessment. 

When  the  student  sample  selection  had  been  completed  and  the  selected 
students  had  been  assigned  to  sessions  on  the  Student  Listing  Form,  the 
supervisor  completed  an  Administration  Schedule  for  each  session  to  be  held 
in the school. These Administration Schedules served as student rosters to
be  used  by  the  school  coordinators  and  exercise  administrators  to  carry  out 
the  sessions. 

On each Administration Schedule, the supervisor entered the day, date,
time, and location of the session, type of session (spiral or tape) and the
name and ID number of the exercise administrator conducting the session.
The supervisor then carefully transferred the name, homeroom, birthdate,
grade, and sex of each student assigned to that session from the Student
Listing Form. The Administration Schedule used in the Grade 4/Age 9 and
Grade 8/Age 13 schools had two copies and the schedule used in Grade 11/Age
17 schools had three copies. The additional copy was added for the Grade
11/Age 17 schools since the second copy (which was retained by the school)
was usually given to the school coordinator prior to the start of a session
and therefore did not have booklet numbers recorded on it. (The school
needed to retain a copy of the Administration Schedule with student booklet
numbers in order to participate in a follow-up language study.)

The supervisor gave the top copy (or top two copies for the Grade
11/Age 17 assessment) of the Administration Schedule to the exercise
administrator who was to conduct the session. The exercise administrator
used this copy during the session to check attendance, observe
race/ethnicity and record the student identification number from the
assessment booklet. After the session, the exercise administrator reported
the results of the session by entering the number of students assessed,
number of students absent, bundle numbers from which booklets were used,
and number of used and unused booklets. The exercise administrator then
tore off the top copy at the perforation (between "Homeroom" and
"Birthdate"). The portion of the top copy of the form containing the names
was left with the school coordinator (who also retained the second copy of
the entire schedule) and the tear-off portion without the names was given
to the district supervisor. The district supervisor and exercise
administrators used the tear-off portion to code the front cover of the
test booklets and mailed this tear-off portion to ETS with the test
booklets and other reporting forms.

The  Roster  of  Questionnaires  served  as  an  important  recordkeeping  and 
shipping  document  for  Excluded  Student  Questionnaires  and  Teacher 
Questionnaires.    This  form  was  printed  on  three-part  paper  and  completed  by 
the district supervisor. One copy of the form was sent to ETS, one copy
was  sent  to  Westat,  and  one  copy  was  retained  by  the  supervisor. 



The  first  section  of  the  Roster  of  Questionnaires  contained  relevant 
information about the excluded students. The supervisor recorded the
excluded  student's  line  number  from  the  Student  Listing  Form,  the  type  of 
session  for  which  the  excluded  student  had  been  selected,  the  ID  number 
from  the  Excluded  Student  Questionnaire,  and  information  about  shipment  of 
the  Excluded  Student  Questionnaire  to  ETS. 

The  second  section  of  the  Roster  of  Questionnaires  pertained  to  the 
teacher  survey.    Here  the  supervisor  recorded  the  line  number  from 
the  Student  Listing  Form  of  the  student  selected  for  the  teacher  survey, 
the code number assigned to the teacher, the ID number from the Teacher
Questionnaire, and information about shipment of the Teacher Questionnaire
to ETS. Also, at the bottom of the Roster was a place to indicate whether
the  School  Characteristics  and  Policy  Questionnaire  was  enclosed  with  the 
shipment  of  school  materials  to  ETS. 

If the district supervisor was unable to collect all of the Excluded
Student Questionnaires, Teacher Questionnaires, and/or the School
Characteristics and Policy Questionnaire before shipping the assessment
materials for a school to ETS, he or she made follow-up contacts with the
school to collect the remaining questionnaires. When remaining
questionnaires were collected and shipped to ETS, the supervisor completed
a Supplemental Transmittal Sheet which contained the ID numbers of the
questionnaires included in the shipment as well as the ID numbers of
questionnaires that had not yet been transmitted. This form was also
printed on three-part paper; one copy was sent to ETS, one copy was sent to
Westat, and one copy was retained by the supervisor.

The School Worksheet was the control document used by the district
supervisor to report the results of the assessment in a school. It was
printed on three-part paper; one copy was sent to ETS, one copy was sent to
Westat, and one copy was retained by the supervisor. ETS used this document
to verify the supervisor's shipment of the school's assessment materials.
Westat used this document to enter the results of the assessment (number of
students to be assessed, number assessed, number absent, session number,
assessment date and room in which the assessment was conducted) into an
automated management system.

Session attendance results and bundles used were entered by session for
spiral and for tape on the School Worksheet. At the bottom of the form,
the response rate for all spiral sessions and the response rate for each
tape session were calculated in order to determine the need for a makeup
session. If a makeup session was necessary, the results of the makeup
session were recorded on the School Worksheet for Makeup Sessions.
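
The worksheet calculation described above can be sketched as follows. This
is only an illustration under stated assumptions: the response rate is
assumed to be the number of students assessed divided by the number to be
assessed, and `makeup_threshold` is a hypothetical parameter, since the text
does not state the cutoff that triggered a makeup session. The names
`SessionResult`, `response_rate`, and `makeup_needed` are likewise invented
for the example.

```python
from dataclasses import dataclass


@dataclass
class SessionResult:
    session_type: str    # "spiral" or "tape"
    to_be_assessed: int  # students assigned to the session
    assessed: int        # students actually assessed


def response_rate(sessions):
    """Pooled rate: total assessed / total to be assessed (assumed formula)."""
    expected = sum(s.to_be_assessed for s in sessions)
    assessed = sum(s.assessed for s in sessions)
    return assessed / expected if expected else 0.0


def makeup_needed(sessions, makeup_threshold):
    """Return (spiral_flag, tape_flags).

    Spiral sessions are pooled into one rate; each tape session gets its
    own rate, following the worksheet description. The threshold itself
    is hypothetical.
    """
    spiral = [s for s in sessions if s.session_type == "spiral"]
    tapes = [s for s in sessions if s.session_type == "tape"]
    spiral_flag = bool(spiral) and response_rate(spiral) < makeup_threshold
    tape_flags = [response_rate([t]) < makeup_threshold for t in tapes]
    return spiral_flag, tape_flags
```

Pooling the spiral sessions while rating each tape session separately mirrors
the distinction the worksheet drew between the two session types.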

An  Exercise  Observation  Sheet  was  completed  by  the  district  supervisor 
whenever he or she or an exercise administrator noticed a problem related
to  an  exercise.    These  sheets  were  mailed  with  the  school's  assessment 
materials  to  ETS. 




A  Session  Header  Form  was  completed  for  each  session  in  a  school  and 
placed  on  top  of  the  booklets  used  in  the  session  when  the  school's 
assessment  materials  were  shipped  to  ETS.    The  Session  Header  Form 
contained  the  supervisor's  name,  the  PSU  and  school  numbers,  and  the 
session  number. 

ETS  was  responsible  for  the  distribution  of  materials  relating  to  the 
actual  assessments,  such  as  test  booklets  and  questionnaires,  as  well  as 
supplies  used  for  the  shipment  of  assessment  materials.    At  the  start  of 
work  in  each  PSU,  ETS  shipped  these  materials  and  supplies  to  the  district 
supervisors. 

With  the  materials  and  supplies  for  the  first  PSU  in  a  grade/age 
assessment,  ETS  sent  two  sets  of  each  type  of  stimulus  tape  to  be  used  in 
the  tape  sessions. 

7.4.3    Materials  and  Reporting  Forms  for  Pre-Assessment 

In August 1983, ETS regional office staff made the initial contact with
public school superintendents and private school officials. The Results of
Contact with Superintendent/Private School Official Form was designed to
document the results of these calls. When a final determination had been
made regarding whether or not the school would participate, completed forms
were mailed directly to the appropriate district supervisor by the ETS
regional office. The form had two parts: Part I was completed by ETS staff
and Part II was completed later by the district supervisor. The form was
printed on four-part paper; one copy was retained by ETS, two copies were
sent to the district supervisor, and one copy was mailed to Westat. Once
the supervisor received this form from ETS, the supervisor proceeded to
call the school to arrange for an introductory meeting. These arrangements
were entered on Part II of the form.

Using the information in Part II of the Results of Contact Form, the
district supervisor completed the Schedule of Introductory Meetings. On
this form the district supervisor entered the date, time, and location of
the introductory meeting and listed the names of all persons asked to
attend. This form was mailed to Westat with one copy of the Results of
Contact Form. Using the Schedules of Introductory Meetings, mailing clerks
at Westat prepared letters specifying the date, time, and location of the
introductory meeting and mailed them to each of the persons entered on the
schedules.

During the telephone call by the district supervisor to the public
school superintendent or private school official to set up the introductory
meeting, the supervisor reviewed and updated his or her copies of the PSU
Listing of Selected Schools. The update was intended to uncover new
schools that may have opened, sampled schools that may have closed, changes
in grade span or enrollment, or corrections in the name and/or address of
the superintendent, principal or school. The district supervisor was given
two copies of the PSU Listing of Selected Schools. All name and address
changes were recorded on the two copies of this listing and one copy of the
listing was mailed to Westat. All other types of changes (e.g., school
openings or closings, or changes in grade span or enrollment) were recorded
on the School Update Form. The School Update Form was also mailed to
Westat and given to the sample statisticians who used it to make
adjustments to the sample when necessary.

During  the  introductory  meeting,  the  district  supervisors  collected  the 
Principal  Questionnaire.    The  supervisor  reviewed  the  information  entered 
on  the  Principal  Questionnaire  for  completeness  at  the  meeting  and  sent 
completed  Principal  Questionnaires  to  Westat. 

Prior  to  the  introductory  meeting,  the  district  supervisor  received  two 
copies  of  the  School  Control  Form,  a  computer-generated  form  containing 
summary  information  about  the  school.    The  first  section  of  the  form, 
School Information Provided, supplied the supervisor with the estimated
total eligible students and the preliminary number of sessions expected in
the school. The second part of the form was completed by the supervisor at
the introductory meeting. Items completed were the name of the school
coordinator, if one had been appointed; the dates of the assessment week
agreed upon with the principal; how the school planned to complete the
Student Listing Form; and any other information learned about the school's
requirements for conducting the assessment. One copy of the School Control
Form  was  mailed  to  Westat;  the  second  copy  was  retained  by  the  supervisor. 




Chapter  8 

MATERIALS  PROCESSING  AND  DATABASE  CREATION^ 


John  L.  Barone 
Educational  Testing  Service 


The  previous  chapter  on  field  administration  described  the  conduct  of 
the  NAEP  assessment  in  the  field  to  the  point  of  shipment  of  materials  to 
ETS.    This  chapter  details  the  receipt,  processing  and  final  disposition  of 
these  assessment  materials  at  ETS  as  they  were  transcribed  to 
computer-readable  form  and  placed  in  an  integrated  NAEP  database  to  be  used 
for  data  analysis  and  reporting.    This  database  is  now  available  to 
external users via the public-use data tapes (PUDTs).

The flow of materials, creation of data files, and creation of the NAEP
database  are  depicted  as  an  ordered  set  of  processes  that  are  applied 
either  to  the  assessment  materials  or  to  the  transcribed  data.  The 
following  chapters  describe  each  of  these  processes  in  detail. 

The  large  volume  of  collected  data  and  the  complexity  of  the  Year  15 
NAEP  design,  with  its  spiralled  distribution  of  many  books,  required  the 
development and use of NAEP-specific data entry and management systems,
including carefully planned and well-defined editing and quality control
procedures. This chapter discusses the implementation and use of systems
and processes that resulted in data management procedures that were
effective and responsive and that ensured the quality and integrity of NAEP
data.
The  result  is  the  final  NAEP  database,  which  met  the  original  objectives  of 
integrity  and  usefulness,  and  exceeded  stringent  standards  for 
"correctness"  and  quality. 

Figure  8-1  is  a  flow  diagram  that  shows  the  conceptual  framework  of 
ordered  processes  that  were  applied  to  the  NAEP  materials  and  data  files. 
The dashed line through the center of the figure divides the outline into
two sets of processes, Processing Assessment Materials and Database
Creation, described below.

The processes represented by solid-line boxes in the flow diagram were
performed at ETS on the paper materials or computer files. The two
processes  enclosed  in  dashed-line  boxes  (Field  Administration  and  Derive 
Sampling  Weights)  were  performed  by  Westat  and  are  discussed  in  detail  in 
Chapters  7  and  4,  respectively. 


^Flow diagrams for this chapter were produced by William Van Hassel.




Figure 8-1
Data Flow Overview

[Flow diagram not reproduced. Panel titles: PROCESSING ASSESSMENT MATERIALS;
DATA BASE CREATION. Legible box labels: NAEP Instruments and Rosters; Field
Administration (dashed box); Materials Receipt; Professional Scoring;
Transcription Systems; Quality Control; Materials Storage; Sample Weights
Derivation (dashed box).]




8.0.1    Processing Assessment Materials


The left side of Figure 8-1 depicts the flow of NAEP "paper" materials.
Chapter 8.1 describes this flow in detail and discusses how information
contained on the field rosters, schedules, and worksheets was used as a
controlling mechanism for processing of materials. It also follows the
path of each assessment instrument (student assessment books, School
Characteristics and Policy Questionnaires, Teacher Questionnaires, Excluded
Student Questionnaires), school worksheets, and administration schedules as
they are tracked through the appropriate processes that result in the final
integrated NAEP database.

The following is a brief description of each process involved in
materials processing as shown in Figure 8-1. Each description refers the
reader  to  the  section(s)  or  chapter(s)  in  which  the  process  is  discussed  in 
detail. 


Field Administration is the conduct and monitoring of the
NAEP  assessment  in  the  schools.    Chapter  7  discusses  this  process 
in  detail. 

Materials  Receipt  refers  to  receipt  and  processing  of 
assessment  materials  at  ETS.    Section  8.1.1  describes  the 
procedures  and  forms  that  were  used  to  check  and  verify  the 
receipt  of  documents  from  the  field.    It  also  discusses  the 
follow-up  procedures  that  were  initiated  when  discrepancies  were 
identified.    As  a  result  of  this  process,  paper  materials  were 
received and subsequently batched for NAEP materials processing
and  data  transcription. 

Professional  Scoring  is  the  process  that  resulted  in  the 
scoring  of  the  open-ended  NAEP  reading  and  writing  items.  Chapter 
8.2  describes  the  items,  types  of  scoring  used,  scoring  operation, 
reliability  checks,  and  resolution  of  scoring  discrepancies. 
Entry  and  editing  of  this  data  are  discussed  in  Sections  8.1.4  and 


Data Transcription Systems refers to the methodology used to
transcribe NAEP materials to computer-readable form. The
transcription method used for each NAEP instrument is discussed in
Chapter 8.1. Chapter 8.3 describes the design, structure, and
development of the NAEP-specific data entry system used to
transcribe  most  of  the  NAEP  materials  to  computer  files;  it  also 
discusses  the  tracking  and  audit  mechanisms  that  were  built  into 
the  system  to  ensure  that  all  data  was  properly  processed  and 
accounted  for. 

Editing refers to the ETS procedures that ensured the
correctness and integrity of the NAEP data files by (1) validating
every field of NAEP data that was entered into computer-readable
form, (2) identifying any invalid or inconsistent values, and (3)
correcting or flagging as unresolvable those values identified as
invalid or inconsistent. Chapter 8.4 describes these procedures.


Quality  Control  refers  to  the  ETS  procedures  that  assessed 
the  accuracy  of  the  data  transcription  and  editing  operations. 
Chapter  8.5  discusses  the  quality  control  procedures  used  in  NAEP 
and  provides  a  summary  of  the  likely  error  rates. 

Materials  Storage  refers  to  the  final  disposition  of  NAEP 
"paper"  materials  after  processing  had  been  completed.  Chapter 
8.1  discusses  materials  storage. 


8.0.2    Database  Creation 

The  right  side  of  Figure  8-1  depicts  the  evolution   of  the  integrated 
NAEP  database  from  the  transcribed  data  to  the  final  database,  available  to 
external  users  via  the  PUDTs.    Chapter  8.6  describes  the  processes  through 
which  the  database  evolved. 

The  remainder  of  this  section  contains  a  brief  description  of  each 
process  involved  in  Database  Creation  as  shown  in  the  figure.  Each 
description  also  refers  the  reader  to  the  section(s)  or  chapter(s)  in 
which  the  process  is  discussed  in  detail. 

Data Files refers to (1) the data files created by the
ETS/NAEP  data  transcription,  editing  and  resolution  systems  and 
(2)  the  labeling  files  (discussed  in  Chapter  8.6)  that  contain 
descriptive  information  on  every  item  used  in  NAEP. 

Extract  is  the  process  discussed  in  Section  8.6.1  that 
created  data  files  containing  specific  demographic  data  fields 
from  the  ETS/NAEP  data  files.    These  data  files  were  required  by 
Westat to derive sampling weights.

Sample  Weights  Derivation  was  performed  by  Westat  and  is 
discussed in Chapter 4. This process produced computer tape files
containing  sampling  weights  for  every  student  and  school  assessed 
by  NAEP. 

Merge  refers  to  the  final  integration  of  NAEP  data  files  into 
the  NAEP  database.    This  process,  discussed  in  Section  8.6.2, 
merged the NAEP data files, labeling files, and the NAEP sampling
weights  into  one  inclusive  database. 

NAEP  Database  is  the  final,  integrated  NAEP  database  that 
contains  all  Year  15  NAEP  data.    This  is  the  database  that  is 
ultimately  made  available  to  external  users  via  the  PUDTs.  The 
structure  of  the  NAEP  database  is  discussed  in  Chapter  8.6;  the 
PUDTs  are  discussed  in  Chapter  8.7. 




Chapter  8.1 
PROCESSING  ASSESSMENT  MATERIALS 


Alfred  M.  Rogers 
Norma  A.  Norris 

Educational  Testing  Service 


Chapter  7,  Field  Administration,  traced  the  progress  of  the  assessment 
booklets  and  related  documents  in  the  field  to  the  point  of  shipment  to 
ETS.  This  chapter  details  the  receipt  and  processing  of  these  assessment 
materials at ETS.


8.1.1    Materials  Receipt 

It  was  the  responsibility  of  the  district  supervisor  to  complete  and 
mail  a  postcard  to  ETS  at  the  completion  of  assessment  administration  in 
each  school.  This  card  contained  the  assessed  school  identification,  the 
number  of  boxes  shipped,  and  the  mode  of  shipment.  The  receipt  of  this  card 
at  ETS  alerted  staff  to  expect  arrival  of  the  shipment  within  seven  working 
days.  If  after  seven  days  the  shipment  had  not  arrived,  ETS  notified 
Westat,  who  in  turn  initiated  a  trace  of  the  shipment.  This  tracing  process 
was successful in all cases except one, in which the full set of assessment
materials  from  one  school  was  never  recovered.  Some  other  shipments  broke 
open  in  transit.    In  all,  55  booklets  were  lost  or  damaged. 

The  shipment  from  each  school  contained  the  school  worksheet; 
administration schedule; Questionnaire roster; School, Teacher, and
Excluded Student Questionnaires; and assessment booklets, bundled by
session, with session header sheets. The format and content of these
instruments are documented in the chapter on field administration. The
following discussion of check-in procedures presumes an understanding of
information contained in and inter-relationships among these instruments.

The school worksheet contained summary counts of the booklets used in
all assessment sessions in each school. The session numbers listed on the
worksheet were first checked against the session numbers written on the
session header sheets enclosed with each bundle of assessment booklets. The
booklets within each session were then counted and checked against both the
count written on the session header sheet and the counts of used and unused
booklets in the corresponding columns of the school worksheet. All
discrepancies in the counts were referred to the administration schedules
for resolution. The booklet numbers from the bundle in question were
compared against the listing of booklet numbers on the schedule. If the
discrepancy could not be resolved by this process, Westat was notified and
in turn contacted the appropriate district supervisor for resolution.
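
The check-in comparison can be sketched roughly as below. The check-in at
ETS was a manual procedure; this is only an illustration of its logic. It
assumes that the physical count for a session was compared with the header
sheet count and with the worksheet's used-plus-unused total, which the text
does not spell out. The function name and data layout are hypothetical.

```python
def check_in(worksheet, headers, counted):
    """Flag check-in discrepancies for one school.

    worksheet: {session: (used, unused)} from the school worksheet
    headers:   {session: count} from the session header sheets
    counted:   {session: count} from physically counting each bundle
    Returns a list of (session, problem) pairs to resolve against the
    administration schedules.
    """
    problems = []
    # Step 1: session numbers on the worksheet must match the header sheets.
    for session in sorted(set(worksheet) ^ set(headers)):
        problems.append((session, "session number mismatch"))
    # Step 2: compare booklet counts for sessions present in both sources.
    for session in sorted(set(worksheet) & set(headers)):
        used, unused = worksheet[session]
        n = counted.get(session, 0)
        if n != headers[session]:
            problems.append((session, "count differs from header sheet"))
        if n != used + unused:
            problems.append((session, "count differs from worksheet"))
    return problems
```

An empty result corresponds to a shipment that passed check-in; anything
else would have gone to the administration schedules and, if still
unresolved, to Westat.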

The Teacher and Excluded Student Questionnaires were then counted and
compared against the questionnaire roster. Any discrepancy in the Excluded
Student Questionnaire counts was referred to Westat and again, in turn, to
the district supervisor for resolution. Since the field administration
procedures permitted a separate shipment of teacher and school
questionnaires, any discrepancy in the Teacher Questionnaire counts alerted
the  receiving  staff  to  expect  a  later  shipment. 

When all of the student-related materials for a school had been
received and checked in, the assessment schedules, school worksheet,
assessment booklets, and questionnaires were forwarded to the data
operations coordinator for transcription processing. The operations
coordinator separated these materials according to the appropriate data
entry procedures: the assessment schedules were accumulated and shipped in
batches to key entry; the school worksheet and assessment session bundles
were sent directly to data entry systems; the Excluded Student
Questionnaires were also batched and sent to data entry systems as
scheduling permitted; and the Teacher and School questionnaires were
accumulated  and  held  for  data  entry  until  the  student  and  excluded  student 
instruments  were  completed.  The  remainder  of  this  section  follows  these 
instruments  through  entry,  editing,  and  quality  control  processing. 


8.1.2    Administration  Schedules 

As  described  in  Chapter  7,  the  administration  schedules  contain  the 
demographic  characteristics  of  the  students  selected  for  the  assessment. 
This  information,  which  included  the  sex,  ethnic  origin,  grade,  and  birth 
date of the sampled students, was used by Westat in the derivation of
sampling  weights.  The  booklet  numbers  of  the  students  who  participated  were 
transcribed  to  the  schedule  at  the  time  of  the  assessment,  and  the 
demographic  information  was  in  turn  transcribed  to  the  front  covers  of  the 
booklets  after  the  assessment. 

The  demographics  of  the  students  who  were  sampled  but  did  not 
participate  in  the  assessment  (exclusions  and  absentees)  were  used  to 
adjust  the  sampling  weights  of  those  who  did.  The  excluded  student 
information could be obtained from the Excluded Student Questionnaire data,
but  the  information  on  absentees  could  only  be  found  on  the  administration 
schedules.  It  was  imperative,  therefore,  that  this  information  be 
transcribed  to  computer-readable  media  and  combined  with  the  assessed  and 
excluded  student  data. 

The  administration  schedule  data  was  transcribed  to  computer  tape  by 
the  key  entry  systems  at  ETS.  One  record  was  generated  for  each  absent 
student  (line)  on  the  form.  The  PSU,  school,  and  session  codes  from  the  top 
of  the  form  were  repeated  for  each  student  on  the  form.  The  information 
transcribed  for  each  absent  student  included  sex,  grade,  and  birth  date. 
These data were ultimately used by Westat to adjust the sample weights.
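
The record layout described above, one record per absent student with the
form-level codes repeated on each, can be illustrated with a short sketch.
The field names and the `status` marker are hypothetical; the text specifies
only which items were transcribed, not how they were coded.

```python
def absent_records(psu, school, session, lines):
    """Build one record per absent student on an administration schedule.

    lines: list of dicts, one per student line on the form, with the
    hypothetical keys 'status', 'sex', 'grade', and 'birth_date'.
    The PSU, school, and session codes from the top of the form are
    repeated on every record.
    """
    return [
        {
            "psu": psu,
            "school": school,
            "session": session,
            "sex": line["sex"],
            "grade": line["grade"],
            "birth_date": line["birth_date"],
        }
        for line in lines
        if line["status"] == "absent"
    ]
```

Repeating the form-level codes makes each record self-identifying, so the
absentee records can later be combined with the assessed and excluded
student data.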




At  the  completion  of  entry  processing,  the  data  tape  was  copied  to  disk 
for  editing  and  quality  control  processing.  The  editing  process  consisted 
of  a  validation  program  and  an  interactive  text  editor  for  correcting 
erroneous  data.  The  validation  program  checked  that  the  demographic 
information  was  present  and  within  the  appropriate  ranges.  The  schedules 
were  used  in  this  process  to  resolve  any  errors  or  discrepancies  uncovered 
by the program and to "spot-check" records for quality control.
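
A presence-and-range check of the kind the validation program performed
might look like the following sketch. The allowed values in `RANGES` are
purely illustrative assumptions; the actual codes and ranges used at ETS are
not given in the text.

```python
# Hypothetical allowed values for each demographic field.
RANGES = {
    "sex": {1, 2},
    "grade": set(range(1, 13)),
    "birth_month": set(range(1, 13)),
    "birth_year": set(range(60, 80)),  # two-digit year, assumed
}


def validate(record):
    """Return a list of (field, problem) pairs for one absentee record.

    A field is reported as 'missing' when absent from the record and as
    'out of range' when its value is not among the allowed values.
    """
    errors = []
    for field, allowed in RANGES.items():
        value = record.get(field)
        if value is None:
            errors.append((field, "missing"))
        elif value not in allowed:
            errors.append((field, "out of range"))
    return errors
```

Records returning a non-empty list would be the ones looked up on the paper
schedules and corrected in the text editor.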

The  assessment  schedules  were  retained  by  the  operations  coordinator  in 
anticipation  of  future  questions  about  and  references  to  the  sample.  This 
proved  to  be  the  most  efficient  and  compact  means  of  retaining  the  relevant 
raw  data  since  the  schedules  for  all  three  grade/age  assessments  could  be 
contained  in  one  storage  box. 


8.1.3    School  Worksheets 

The  school  worksheets  were  forwarded  by  the  operations  coordinator  to 
entry  staff  for  processing  under  the  NAEP  data  entry  system.  This  system 
was  designed  and  developed  by  ETS  staff  to  meet  the  singular  requirements 
of  the  NAEP  Year  15  design,  and  is  more  fully  described  in  Chapter  8.3. 

Each  column  of  the  school  worksheet  contained  information  pertaining  to 
the  administration  activity  of  each  session  within  a  school.  This 
information  included  the  date,  time,  and  location  of  the  administration, 
the  exercise  administrator  code,  and  the  counts  of  the  students  sampled, 
excluded, absent, and assessed. These data, along with the PSU, school, and
session  codes,  were  keyed  into  the  system  by  entry  staff. 

To  enter  this  information,  entry  staff  had  to  first  log  on  to  the 
computer and start the data entry program. The program prompted for the
operator's  initials,  which  would  be  used  in  subsequent  reporting  of  entry 
processing  activity.  The  operator  was  then  presented  with  a  primary  menu, 
requesting  input  of  the  codes  for  the  instrument  to  be  processed  and  the 
processing  mode.  The  codes  and  their  associated  actions  were  listed  below 
the corresponding entry fields. The operator typed in the codes for "School
Worksheet"  processing  under  "Entry"  mode  and  pressed  the  ENTER  key.  A 
second  screen  appeared,  requesting  input  of  the  PSU  and  school  codes,  and 
the  number  of  spiral  and  tape  sessions  to  be  entered  for  that  school.  The 
operator  keyed  in  these  values  and  pressed  ENTER  again.  The  program  then 
presented  one  entry  screen  for  each  session  to  be  entered,  automatically 
assigning  the  session  code  for  spiral  sessions  and  requesting  the  booklet 
number  for  the  tape  session  code.  The  operator  then  keyed  in  each  column  of 
information  from  the  worksheet  and  pressed  ENTER  to  proceed  to  the  next 
session.  When  all  sessions  for  a  school  had  been  entered,  the  program  would 
re-display  the  school  screen  if  there  were  more  worksheets  to  process.  If 
the operator had no more worksheets to enter, pressing ENTER with no data
in the PSU and school fields would return the program to the primary menu,
from  which  control  could  be  passed  to  other  parts  of  the  entry  system. 


The  entry  system  controlled  the  processing  of  student  data  and 
maintained  statistics  on  the  entry  activity  at  the  session  level.  This  was 
accomplished  by  means  of  a  tracking  file,  on  which  each  record  contained 
all  control  and  reporting  information  for  one  session.  The  entry  of  the 
school  worksheet  information  thus  generated  a  new  record  on  the  tracking 
file  for  each  session,  initializing  the  control  parameters.  The  system 
would  not  allow  entry  of  student  data  to  proceed  unless  the  school 
worksheet information had first been entered.

The  operations  coordinator  was  provided  with  procedures  for 
periodically monitoring and reporting data entry activity. These procedures
compared the counts of booklets processed at each stage with the initial
counts from the worksheet and flagged discrepancies. This, in turn, alerted
the  coordinator  to  possible  missing  or  extra  booklets.  If  the  school 
worksheet  information  was  determined  to  be  in  error,  the  operations 
coordinator  had  the  facility  to  correct  the  tracking  file  data  to  prevent 
reoccurrence  of  the  discrepancies  in  the  activity  report. 
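
The monitoring comparison described above can be sketched as follows. This is a minimal illustration only, not the ETS program: the record layout, the field names, and the function find_discrepancies are hypothetical, and the sketch assumes it is applied to sessions whose processing at each stage is complete.

```python
from dataclasses import dataclass


@dataclass
class SessionRecord:
    """One tracking-file record for a session (hypothetical layout)."""
    psu: str
    school: str
    session: str
    assessed_on_worksheet: int   # count keyed from the school worksheet
    booklets_entered: int = 0    # booklets processed under entry mode
    booklets_verified: int = 0   # booklets processed under verification mode


def find_discrepancies(records):
    """Return (record, stage, difference) for each stage whose booklet
    count differs from the worksheet count of assessed students.
    A negative difference suggests missing booklets, a positive one extras."""
    flagged = []
    for rec in records:
        for stage in ("booklets_entered", "booklets_verified"):
            diff = getattr(rec, stage) - rec.assessed_on_worksheet
            if diff != 0:
                flagged.append((rec, stage, diff))
    return flagged
```

A coordinator's report would list the flagged sessions; correcting the worksheet count on the tracking record removes the flag on the next run.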

The  school  worksheets  were  retained  by  the  operations  coordinator  in 
anticipation of later queries, since they could be compactly stored and
easily  referenced. 

8.1.4    Student Assessment Instruments

The  student  assessment  booklets  were  forwarded  directly  to  the  data 
entry area as the complete set of materials was received from each school.
The booklets were bundled by session, with a session header sheet attached
to the top of each bundle. This sheet contained the PSU, school, and
session codes, serving to identify each bundle. The header sheets were
retained  with  the  bundles  throughout  entry  processing. 


8.1.4.1    Response Data Entry

The entry operator initiated student data entry by entering the
"Student Data" and "Entry" codes on the primary menu. The entry program
then displayed a screen requesting input of the PSU, school, and session
codes for the session data to be entered. If the tracking file information
indicated that entry processing had terminated for that session, the
program displayed a message to the operator; and if the session code was
correct, the problem was referred to the operations coordinator for
correction.

If entry processing was permitted, the program displayed a screen for
the entry of student booklet cover information and requested the entry of
the booklet serial number. If the booklet number was incorrect, or a
booklet with that serial number had already been entered, processing
stopped, a message was issued, and the operator could either enter the
correct serial number if it was mis-keyed, or set the booklet aside for
resolution by the operations coordinator. If the serial number was
acceptable, the program prompted for entry of the block codes printed on
the booklet [cover, to verify that the correct serial] number had been
entered. If [the block codes had] been entered correctly from the booklet
but did not agree with the programmed codes, program control was returned
to the entry of [the serial number], and the operator had to again either
enter the correct serial number or set the booklet aside for resolution.

[After the block codes were accepted,] the program prompted for entry
of the [remaining booklet cover] information. This information included the
administration code, exercise administrator code, student's sex, ethnicity,
grade, and birth date, and the PSU and school codes. The entry of these
[data followed the same model as the entry of] the response data contained
in the rest of the student assessment instruments as well as the school,
teacher, and excluded student questionnaires. This model will be described
[briefly here; a more detailed] explanation of the program functioning is
found in Section 8.3.7.

[With the exception] of the non-scorable open-ended response items, the
responses [to all] items could be entered from the numeric keypad on the
computer terminal keyboards. For the multiple-choice items, the program
software automatically converted the entered numeric values into their
alphabetic counterparts: "A" for "1", "B" for "2", [and so on]. [Other keys
on] this keypad were reserved for special processing codes: the hyphen
[illegible] indicated multiple responses [where a single response was]
expected; the comma was converted to a question mark by the program and
flagged data which the operator [could not resolve] immediately and which
needed resolution by the operations coordinator or designated entry staff.
Additionally, three function keys allowed the operator to control field
processing: the TAB key passed control to the next field; the BACKSPACE key
passed control to the previous field; and the ENTER key signalled to the
program the completion of processing for a field or an entire form.
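
The keypad conversion described above can be sketched as follows. This is a hedged illustration, not the original entry software: the function name convert_keystroke and its n_options parameter are hypothetical, and the stored form of the multiple-response code is an assumption, since only the hyphen key itself is legible in the source.

```python
MULTIPLE_RESPONSE = "-"   # assumption: stored form of the multiple-response code
UNRESOLVED = "?"          # the comma key was stored as a question mark


def convert_keystroke(key: str, n_options: int) -> str:
    """Convert one keypad entry for a multiple-choice item.

    Digits 1..n_options become "A", "B", ...; the hyphen marks a multiple
    response; the comma flags a value needing later resolution.
    """
    if key == "-":
        return MULTIPLE_RESPONSE
    if key == ",":
        return UNRESOLVED
    if key.isdigit() and 1 <= int(key) <= n_options:
        return chr(ord("A") + int(key) - 1)
    raise ValueError(f"{key!r} is not a valid entry for this item")
```

For a four-option item, convert_keystroke("2", 4) yields "B", while an entry of "5" is rejected.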

The program controlled processing of the entered data virtually at the
[illegible], alerting the operator only when the data values failed to
meet range validation criteria. If an invalid data value was entered, the
program "locked" on the problem field, disabling the [illegible] until a
legitimate value was entered. At the completion of the last field on a
form, the operator would press ENTER and the program would scan the
entered data for blanks, to ensure that no fields had been skipped or
otherwise erased. If a blank was found, the operator was alerted and
instructed to fill in the problem field.
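
The two checks described above, the field "lock" on out-of-range values and the end-of-form scan for blanks, can be sketched as follows. The function names and the representation of a form as a dictionary of keyed strings are hypothetical; the sketch only illustrates the logic.

```python
def enter_field(attempts, valid_values):
    """Consume keyed attempts until one passes range validation.

    Returns the accepted value and the number of rejected attempts,
    mimicking a field that stays locked until a legitimate value is keyed.
    """
    rejected = 0
    for value in attempts:
        if value in valid_values:
            return value, rejected
        rejected += 1
    raise ValueError("no legitimate value was keyed for this field")


def find_blank_fields(form):
    """Return the names of fields left blank (skipped or erased)."""
    return [name for name, value in form.items() if value.strip() == ""]
```

The blank scan runs once, when ENTER is pressed after the last field, and names the fields the operator must return to.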

The open-ended non-scorable items were included in the entry process in
an effort to capture all response data. These responses were found in the
[illegible]. [Only a limited number of] characters were permitted by the
system for the entry of this information, so operators had to abbreviate
or use key words at their own discretion. Those items which requested
information on language usage or on country or state lived in were
codified. The entry system displayed the possible responses and their code
values when these fields were encountered in the entry process.


Upon  successful  entry  of  the  booklet  cover  information,  the  program 
displayed  entry  screens  for  each  section  of  the  current  booklet.  The  first 
screen  was  always  for  entry  of  the  common  background  information,  since 
this  was  the  first  section  in  all  student  booklets.  The  BIB  spiral  booklets 
contained  three  additional  sections;  the  UBIB  spiral  and  tape  booklets 
contained only two. The type and order of these sections were completely
controlled  by  the  booklet  number,  according  to  the  NAEP  design.  At  the 
completion  of  entry  for  the  last  section  in  each  booklet,  the  program 
re-displayed  the  booklet  cover  entry  screen  to  accept  input  for  another 
booklet.  A  blank  field  entered  for  the  booklet  serial  number  indicated  the 
end  of  entry  processing  for  that  session.  The  program  performed  session 
clean-up  and  re-displayed  the  session  header  entry  screen  in  anticipation 
of  entry  processing  for  another  session.  A  blank  field  entered  for  the  PSU 
indicated  termination  of  student  data  entry  processing  and  the  program 
returned  to  the  primary  menu. 

Several  of  the  participating  schools  conducted  all  of  their  spiral 
sessions  as  one  large  session.    Consequently,  some  session  bundles  were  too 
large  to  be  accommodated  in  one  entry  sitting.  The  program  permitted 
interruption  of  entry  and  verification  processing  to  adapt  to  the  entry 
operators'  schedules.  At  the  completion  of  entry  processing  for  a  session, 
the  operator's  initials  and  the  date  were  written  on  the  session  header 
sheet  and  the  bundle  was  placed  in  the  staging  area  for  verification 
processing. 

The  entry  mode  created  the  student  data  records  and  wrote  them  to  the 
entry  system  work  files.  The  verification  mode  was  essentially  a  second 
entry  of  the  data  and  a  blind  field-by-field  comparison  with  the  original 
data.  If,  for  any  field,  the  data  value  entered  under  verification  differed 
from  the  initial  value,  the  program  would  "lock"  on  that  field,  issue  a 
message  to  the  operator,  and  allow  the  operator  to  determine  whether  the 
value  was  mis-keyed  or  incorrectly  entered  the  first  time,  enter  the 
"correct"  value,  if  necessary,  and  press  ENTER  to  continue  processing  the 
remaining  fields.  While  the  program  was  locked  on  the  discrepancy,  the 
operator  could  press  the  question  mark  key  to  view  the  initial  data  value. 
During  verification  processing,  each  data  record  was  rewritten  to  the  work 
file  with  all  changed  data  values. 
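
The verification pass can be sketched as a field-by-field comparison of the second keying against the first. This is an illustration under stated assumptions: records are represented as dictionaries, and the resolve callback stands in for the operator's decision at a locked field; neither is taken from the original program.

```python
def verify_record(original, rekeyed, resolve):
    """Compare a second keying with the original, field by field.

    resolve(field, first, second) returns the value judged correct when the
    two keyings differ. Returns the verified record and a list of
    (field, old_value, new_value) for every value that actually changed.
    """
    verified = dict(original)
    changes = []
    for field, first in original.items():
        second = rekeyed[field]
        if second != first:
            correct = resolve(field, first, second)
            if correct != first:
                changes.append((field, first, correct))
            verified[field] = correct
    return verified, changes
```

When the verifier finds that the second keying was the mistake, resolve returns the first value and nothing is recorded as changed.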

At  the  completion  of  verification  processing  for  a  session,  the  program 
printed  an  audit  trail  listing  at  a  printer  in  the  entry  area.  This  listing 
was  a  formatted  summary  of  an  adjunct  file  to  the  work  data  file  which  was 
created  and  updated  by  the  system  during  processing  of  the  session  data.  A 
record  was  written  to  the  audit  file  whenever  the  multiple  response  code  or 
a  question  mark  was  entered  as  a  data  value  under  any  processing  mode,  or 
if,  under  verification  mode,  a  data  value  was  changed  from  its  original 
value.  Each  audit  record  contained  identification  information,  including 
the  PSU,  school,  session,  booklet  serial,  section  and  item  numbers  of  the 
data  value,  and  the  operator  code,  processing  mode,  date  and  time  of  the 
action,  as  well  as  the  old  and  new  data  values. 
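
The rule for writing audit records can be sketched as follows. The AuditRecord fields mirror the items listed above; the class itself, the needs_audit function, and the stored form of the multiple-response code are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

MULTIPLE_RESPONSE = "-"   # assumption: stored form of the multiple-response code
QUESTION_MARK = "?"


@dataclass
class AuditRecord:
    """Identification and action data for one audited value."""
    psu: str
    school: str
    session: str
    booklet_serial: str
    section: int
    item: int
    operator: str
    mode: str
    timestamp: datetime
    old_value: Optional[str]
    new_value: str


def needs_audit(mode: str, old_value: Optional[str], new_value: str) -> bool:
    """Apply the audit rule described for the entry and verification modes."""
    if new_value in (MULTIPLE_RESPONSE, QUESTION_MARK):
        return True          # audited under any processing mode
    return (mode == "verification"
            and old_value is not None
            and new_value != old_value)
```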

This  audit  listing  was  attached  to  the  session  bundle  and  forwarded  to 
the resolution area. Staff assigned to resolution processing reviewed the
audit listing, checked the actual responses in the booklets wherever
question  marks  were  indicated,  determined  the  appropriate  value(s)  to  be 
coded  in  the  data  file,  and  wrote  these  new  codes  on  the  audit  listing. 

The  resolution  mode  of  the  entry  system  permitted  the  operator  to 
access  data  records,  display  the  field  values,  and  make  corrections  to 
individual  fields.  A  change  in  any  data  field  under  resolution  mode  also 
generated  a  record  for  the  audit  file,  and  the  program  produced  a  second 
audit  listing  at  the  completion  of  resolution  processing  for  each  batch. 
There  was  no  limit  to  the  number  of  times  a  session  or  data  record  could  be 
processed  under  resolution. 

On  completion  of  resolution  processing,  each  bundle  was  stored  in  a 
labeled  box  and  held  for  final  editing  and  quality  control  processing. 

The  final  editing  was  performed  after  the  entry  work  files  had  been 
spooled  into  a  master  student  data  file.  This  spooling  program  checked 
every  data  field  of  every  student  record  for  out-of-range  values  and 
question  marks.  A  listing  similar  to  the  audit  listings  for  each  session 
was  produced,  which  resolution  staff  then  used  to  identify  and  correct  the 
remaining  data  anomalies. 

The  quality  control  process  selected  a  random  sample  of  each  booklet 
type  from  the  master  student  file,  identifying  those  booklets  for 
extraction  from  the  raw  data.  The  designated  booklets  were  located,  pulled 
from  their  boxes,  and  forwarded  to  quality  control  staff.  The  responses  in 
each  booklet  were  then  compared  with  their  coded  data  values  in  the  data 
file.  The  full  details  and  results  of  the  quality  control  process  are 
presented in Chapter 8.5. On completion of quality control processing, the
booklets  were  returned  to  their  boxes  and  shipped  to  the  professional 
scoring  area. 
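
The selection of a random sample of each booklet type can be sketched as follows. The sample size per type is a hypothetical parameter here (the actual rates are reported in Chapter 8.5), and the representation of the master file as (serial number, booklet number) pairs is an assumption.

```python
import random
from collections import defaultdict


def select_qc_sample(booklets, per_type, seed=None):
    """Draw a simple random sample of serial numbers within each booklet type.

    booklets: iterable of (serial_number, booklet_number) pairs.
    per_type: hypothetical number of booklets to pull for each type.
    """
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for serial, booklet_number in booklets:
        by_type[booklet_number].append(serial)
    return {
        number: sorted(rng.sample(serials, min(per_type, len(serials))))
        for number, serials in by_type.items()
    }
```

The sorted serial numbers for each type give the list of booklets to be pulled from their boxes.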


8.1.4.2    Professional  Scoring 

The  open-ended  reading  and  writing  items  were  scored  according  to  the 
procedures described in Chapter 8.2. For their initial scoring, the
booklets  were  processed  in  the  same  order  and  session  organization  as  they 
were  received  from  data  entry  systems.  However,  scoring  procedures  required 
a  reliability  or  second  scoring  for  a  20  percent  sample  of  the  booklets. 
Accordingly,  every  fifth  booklet  in  a  batch  was  put  aside  for  this  purpose 
during  initial  scoring.  Additionally,  those  booklets  containing  items  to  be 
holistically  scored  were  held  for  that  process  while  the  remainder  were 
forwarded  to  ETS  key  entry  systems. 
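
Setting aside every fifth booklet yields the 20 percent reliability sample. A minimal sketch, with a hypothetical function name and a batch represented as a list in processing order:

```python
def select_for_second_scoring(batch, step=5):
    """Return every step-th booklet of a batch (the 5th, 10th, 15th, ...)."""
    return batch[step - 1::step]
```

For a batch of 20 booklets this selects 4, that is, 20 percent.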

The  back  cover  of  each  student  booklet  contained  a  row  of  boxes  for 
each  open-ended  reading  and  writing  item  contained  in  the  booklet.  The 
boxes  were  used  by  scoring  staff  to  enter  scores  and  scorer  identification 
codes  according  to  the  scoring  specifications  for  each  item.  The  primary 
trait  scorers  entered  their  identification  codes  into  special  boxes  on  the 
front  cover  of  each  booklet.  Key  entry  staff  transcribed  the  booklet  serial 
number and scorer identification codes from the front of each booklet and
the scores from the back. Because the number of boxes varied from item to
item  and  the  arrangement  and  number  of  items  varied  by  booklet,  the  score 
data were loosely formatted on the data records, which were later untangled
under  the  editing  process.  This  untangling  process  is  described  in  Section 
8.4.2. 

By  the  time  the  booklets  had  completed  scoring  and  key  entry 
processing,  their  session  organization  had  been  substantially  corrupted.  In 
anticipation  of  future  writing  assessments  which  would  require  re-scoring 
the  writing  items  from  Year  15,  the  booklets  were  reorganized  and  boxed  by 
booklet  number.  This  would  facilitate  the  extraction  of  specific  booklets 
from  the  raw  data.  The  booklets  were  then  shipped  to  the  ETS  data  retention 
area  for  long-term  storage. 


8.1.5  Questionnaires 

The  questionnaire  instruments  were  separated  by  type  and  accumulated  by 
the  operations  coordinator  as  they  were  received  from  mail  processing. 
These  data  were  also  transcribed  through  the  data  entry  system  but  on  a 
lower  priority  basis  than  the  student  booklets.  The  Excluded  Student 
Questionnaires  received  higher  priority  than  the  Teacher  and  School 
Questionnaires,  since  the  demographics  of  the  excluded  students  were  used 
in  deriving  the  sampling  weights  of  the  assessed  students.  Every  effort  was 
made  to  keep  the  processing  rate  of  these  instruments  in  pace  with  the 
student  data  entry,  in  order  to  have  the  two  files  completed  at  the  same 
time. 

The  Excluded  Student,  Teacher,  and  School  Questionnaires  each  had  their 
own  processing  options  on  the  primary  menu  of  the  entry  system.  The  entry 
operator  would  enter  the  appropriate  code  for  an  instrument  and  the  entry 
mode  to  initiate  processing.  The  questionnaire  entry  programs  followed  the 
same  model  as  the  student  entry  program  with  the  absence  of  a  tracking  file 
and  session  batching.  Entry,  verification,  and  resolution  modes  were 
available;  audit  reports  were  initiated  by  the  operations  coordinator. 

The  Excluded  Student  Questionnaire  entry  program  first  displayed  a 
screen  for  entry  of  the  front  cover  data.  The  operator  was  prompted  for  the 
serial  number  of  the  booklet  to  be  processed.  An  error  condition  occurred 
if  either  a  record  with  that  serial  number  was  found  under  entry  mode  or  no 
record was found under verification or resolution mode. In either case the
operator was asked to verify that the correct number had been entered. If
the problem persisted, it was referred to the operations coordinator for
resolution. The remaining cover information, including PSU and school code,
student ethnicity, grade, and birth date, was processed as for the
student booklet covers. The program then displayed a single screen for
processing  the  responses  within  the  questionnaire.  When  the  operator 
pressed  ENTER  to  terminate  processing  for  that  booklet,  the  program 
re-displayed  the  cover  entry  screen,  ready  to  process  another  booklet.  A 
blank  field  entered  in  the  serial  number  field  returned  the  program  to  the 
primary  menu. 


The  Teacher  Questionnaire  entry  program  first  displayed  a  screen  for 
entry  of  the  cover  information.  It  processed  the  serial  number  in  the  same 
fashion as did the Excluded Student Questionnaire entry program. The cover
information  only  included  the  PSU,  school,  and  teacher  codes.  As  the 
longest  questionnaire  instrument,  the  Teacher  Questionnaire  required  three 
screens  for  entry  processing  due  to  software  limitations  as  well  as  general 
appearance  and  ease  of  reading.  Completion  of  processing  for  each  booklet 
returned  the  program  to  the  cover  entry  screen,  where  the  entry  of  a  blank 
serial  number  returned  the  program  to  the  primary  menu. 

The  School  Questionnaire  entry  program  also  started  with  a  display  of 
the  cover  entry  screen.  The  only  information  requested  for  this  instrument, 
however,  was  the  PSU  and  school  code  which  also  served  as  the  booklet 
identification  number.  Entry  processing  for  the  questionnaire  information 
was  broken  across  two  screens.  Completion  of  processing  for  each  booklet 
returned  the  program  to  the  cover  entry  screen,  where  the  entry  of  a  blank 
PSU  and  school  code  returned  the  program  to  the  primary  menu. 

After  all  questionnaires  had  been  received  and  processed  through  the 
entry system, a final validation was performed on all data values in all
records.  Any  data  errors  or  discrepancies  were  corrected  at  this  time  using 
the  resolution  mode  of  the  entry  system.  A  final  audit  listing  was 
generated,  recording  all  entry  activities  for  each  questionnaire. 

The  questionnaires  were  subjected  to  the  same  quality  control 
procedures  that  the  student  data  received.  The  details  of  the  sampling 
rates  and  results  are  discussed  in  Sections  8.5.2  through  8.5.4. 

At  the  completion  of  quality  control  processing,  the  questionnaires 
were  packed  into  boxes  and  shipped  to  the  ETS  data  retention  area  for 
long-term  storage. 


Chapter  8.2 
PROFESSIONAL  SCORING 

Anne  Campbell 
Educational  Testing  Service 


The  professional  scoring  of  the  Year  15  NAEP  assessment  was  conducted 
for open-ended reading and writing items from all three grade/ages. Three
methods  of  scoring  were  used:    primary  trait  scoring  for  both  writing  and 
reading  items,  and  holistic  and  mechanics  scoring  for  writing  items. 

Although  NAEP  now  scores  writing  responses  mainly  using  the  primary 
trait  system,  NAEP  used  holistic  scoring  for  its  first  writing  assessment 
in  1969.    Holistic  scoring  evaluates  responses  on  the  basis  of  overall 
impression  rather  than  on  particular  aspects  such  as  mechanics  or 
organization. As a relative process dependent upon the quality of writing
received,  holistic  scoring  did  not  completely  address  the  need  to  report 
performance  levels  for  particular  writing  skills  or  the  need  for  a  scoring 
system  that  could  be  replicated.    As  a  result,  NAEP  began  to  search  for  an 
alternative  scoring  process  to  use  in  the  next  writing  assessment. 

With  input  from  educators  and  measurement  specialists,  NAEP  devised  a 
system  known  as  primary  trait  scoring.    This  system  was  designed  to 
evaluate  the  ability  to  write  for  precisely  defined  purposes  and  thus  uses 
closely  defined  tasks.    When  a  writing  item  is  developed,  a  dominant 
characteristic  or  primary  trait  is  identified.    This  primary  trait  is  the 
basis  for  establishing  criteria  for  evaluating  the  responses.  These 
criteria  are  associated  with  specific  score  points  in  a  scoring  guide. 
Each  score  point  defines  a  level  of  task  accomplishment,  that  is,  the 
degree  to  which  a  response  contains  the  characteristics  required  to 
accomplish  the  purpose  of  the  writing  task. 

Although  the  primary  trait  system  was  developed  specifically  to 
evaluate  responses  to  writing  tasks,  the  scoring  approach  was  adapted  to 
evaluate responses to open-ended reading items. Criteria were defined to
evaluate  how  well  students  responded  to  a  reading  passage  when  asked  to 
perform  such  tasks  as  evaluating  a  story  or  poem,  identifying  and 
supporting  a  mood  of  a  passage  or  using  information  in  a  passage  to  draw 
comparisons  and  contrasts.    Criteria  for  each  task  were  associated  with 
specific  score  points  in  a  scoring  guide. 

Two  distinctions  may  be  made  about  the  items  which  were  scored.  First, 
all  of  the  open-ended  items  were  incorporated  into  booklets  on  the  basis  of 
the spiralling design. However, a few items were also used in booklets
which were accompanied by paced audio tapes (see Chapter 5 for a discussion
of spiral and tape administration). In scoring the spiral and the tape
booklets,  no  distinction  was  made  between  the  two;  the  tape  booklets  were 
included  with  the  spiral  booklets  in  the  batching  process. 

Second,  to  provide  for  trend  analysis,  four  writing  items  were  included 
in  the  Year  15  assessment  which  had  been  administered  in  previous 
assessments.    Three  of  these  items  were  from  the  Year  10  (1978-79) 
assessment;  the  fourth  item  had  been  used  in  both  the  Year  10  and  the  Year 
5  (1973-74)  assessments.    Responses  to  these  four  items  from  the  previous 
assessment  years  were  not  scored  at  the  time  they  were  collected,  but  were 
retained  so  that  they  could  be  scored  at  the  same  time  and  by  the  same 
scorers  as  the  responses  from  the  Year  15  assessment.    Thus,  when  the 
scoring  for  Year  15  began,  the  responses  from  the  previous  assessment  were 
intermingled  and  scored  with  those  from  the  Year  15  assessment.  These 
items  were  scored  using  both  primary  trait  and  holistic  methods;  a 
subsample  of  one  of  the  items  was  scored  for  mechanics  (described  below). 

Four  reading  items  which  had  been  administered  in  two  previous 
assessments were also included in the Year 15 assessment to provide for
trend analysis. The scoring of these items was handled differently from
that of the writing trend items. Responses to these items from the two
previous  assessment  years  were  scored  at  the  time  they  were  collected  and 
were  then  retained.    When  it  came  time  to  score  the  Year  15  responses  to 
the  same  items,  training  papers  from  the  previous  assessments  were  provided 
for  the  readers  to  familiarize  them  with  how  the  items  were  previously 
scored.    Then  a  20  percent  subsample  of  the  responses  from  the  previous 
years was pulled, their scores were masked, and the responses were
distributed  to  the  readers  who  re-scored  them.    The  previous  scores  were 
then  unmasked  and  compared  with  the  scores  given  by  the  current  reader.  If 
the  scores  of  the  Year  15  readers  deviated  drastically  from  the  scores 
given  previously,  special  training  sessions  were  held  to  bring  the  readers 
into  conformity  with  the  previous  scoring.    During  the  time  that  this 
re-scoring  was  going  on,  the  Year  15  responses  to  these  items  were  also 
being  scored. 
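
The comparison of unmasked previous scores with the re-scores can be sketched as follows. The source does not state how a drastic deviation was judged, so the sketch only computes two plausible summary figures, the exact agreement rate and the mean shift; both statistics and the function name are assumptions for illustration.

```python
def agreement_summary(previous, current):
    """Summarize how current readers' scores compare with earlier scores.

    previous, current: equal-length sequences of scores for the same
    responses, in the same order.
    """
    if len(previous) != len(current) or not previous:
        raise ValueError("score lists must be non-empty and of equal length")
    pairs = list(zip(previous, current))
    exact = sum(p == c for p, c in pairs)
    return {
        "n": len(pairs),
        "exact_agreement": exact / len(pairs),
        "mean_shift": sum(c - p for p, c in pairs) / len(pairs),
    }
```

A positive mean shift would indicate that the current readers were scoring more leniently than the earlier readers.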

The  Year  15  NAEP  assessment  included  the  35  open-ended  items  listed  in 
Table 8.2(1). This table provides an overview of reading and writing
items,  including  item  number,  grade/age  level,  and  primary  trait  score 
ranges. 

The  rest  of  this  chapter  will  describe  the  different  methods  of  scoring 
and  will  discuss  the  scoring  operation,  including  training,  work  flow,  and 
reliability.


8.2.1    Description  of  the  Scoring 


8.2.1.1    Primary  Trait  Scoring 

All  open-ended  reading  and  writing  tasks  were  scored  using  the  primary 
trait  system  of  scoring.    This  involved  assigning  a  score  point  based  on  a 
scoring  guide  designed  for  each  item.    The  typical  guide  included  score 
points  of  0  to  4,  7,  8,  and  9,  although  a  few  had  score  points  of  0  to  3,  0 
to  5,  or  0  to  6,  plus  7,  8,  and  9.    A  general  explanation  of  these  score 
points  is  given  below. 

0,  7,  8,  9:    These  scores  were  given  to  responses  that  were 
blank,  indecipherable,  off  task,  or  contained  a  statement 
to  the  effect  that  the  student  did  not  know  how  to  do  the 
task. 

1:    This  score  indicated  an  unsatisfactory  response  in  that 
it  was  very  abbreviated,  circular,  or  disjointed  and  did 
not represent a basic attempt toward addressing the
writing  task. 

2:    This  score  was  given  to  responses  in  which  some  or  all 
the  elements  needed  to  complete  the  task  were  present  but 
were  not  managed  well  enough  to  ensure  that  the  purpose 
of  the  task  would  be  achieved. 

3:    The  responses  given  this  score  point  included  the 
information  and  ideas  critical  to  accomplishing  the 
underlying  task  and  were  considered  likely  to  achieve  the 
desired  purpose. 

4:    This  score  was  given  to  responses  that  went  beyond  the 
essential  by  providing  more  detail  and  being  more 
coherent. 
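
Score ranges for each item are written in Table 8.2(1) in a compact form such as "0-4,7,8,9". A hypothetical helper for expanding that notation into the set of permissible score points, for example to validate keyed scores, might look like this:

```python
def parse_score_range(spec: str) -> set:
    """Expand a range string such as "0-4,7,8,9" into a set of integers."""
    scores = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            scores.update(range(int(low), int(high) + 1))
        else:
            scores.add(int(part))
    return scores
```

Under the typical guide a keyed score of 5 or 6 would be rejected, since neither belongs to the expanded set.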

Along  with  scoring  for  the  primary  trait,  some  tasks  also  required  the 
scoring  of  anywhere  from  one  to  four  secondary  traits  (see  Table  8.2(1)  for 
tasks  scored  for  secondary  traits).    The  scoring  of  the  secondary  traits 
involved  indicating  the  presence  or  absence  of  elements  that  were  of 
special  significance  to  that  particular  item.    For  writing  items  these 
secondary  traits  included  whether  or  not  notes  were  made  before  writing  and 
whether  or  not  critical  information  was  filled  in  on  a  form.    For  the 
reading  items,  scoring  for  the  secondary  trait  involved  analyzing  whether 
supporting  evidence  was  based  on  content,  form,  or  subjective  reaction  plus 
for  some  items  indicating  the  number  of  pieces  of  evidence  that  were 
included.     Primary  and  secondary  trait  scores  for  all  items  in  a  booklet 
were  placed  in  designated  boxes  on  the  booklet's  back  cover. 


Table 8.2(1)
Distribution of Reading and Writing Exercises

                          NAEP      Reading/ Grade/Age Use  Primary Trait  Secondary  Holistic
                          Item      Writing  (4/9, 8/13,    Score          Traits     Score
Item Name                 Number    (R)/(W)  11/17)         Ranges                    Ranges

Dali                      N000100   W        X X X          0-9                       0-6
School Rule               N000200   W        X X X          0-4,7,8,9
Recreation Opportunities  N000300   W        X X            0-4,7,8,9      1
Food on the Frontier      N000400   W        X X X          0-4,7,8,9
Dissecting Frogs          N000500   W        X              0-4,7,8,9
XYZ                       N000600   W        X X            0-3,7,8,9
Swimming Pool             N000700   W        X X X          0-4,7,8,9
Pet                       N000800   W        X X            0-4,7,8,9
Radio Station             N000900   W        X X            0-4,7,8,9
Appleby House             N001000   W        X X X          0-4,7,8,9
Nuts                      N001500   R        X X X          0-9            3
Travels With Charley II   N001900   R        X X            0-5,7,8,9      4
The Door                  N002300   R        X X            0-9            *
Bethune                   N002800   R        X X X          0-5,7,8,9
Goods to Market           N003100   R        X X X          0-5,7,8,9
Dependency                N003700   R        X X X          0-4,7,8,9
Track Meet/Javelin        N004300   R        X X            0-4,7,8,9
Start to Work             N004600   R        X X            0-5,7,8,9
Hole In The Box           N007200   W        X X X          0-4,7,8,9                 0-6
Childhood Memory          N007400   R        X X            0-5,7,8,9      4
Travels With Charley I    N007500   R        X X            0-5,7,8,9      4
Flashlight                N007600   W        X X X          0-4,7,8,9
Ghost Story               N007700   W        X X X          0-4,7,8,9
Favorite Music            N007900   W        X X X          0-4,7,8,9
Split Session             N008000   W        X X            0-4,7,8,9                 0-6
Cow-Tail Switch           N008200   R        X X            0-9            3
Mother and Do             N008900   R        X              0-9            3
Plants                    N014700   W        X              0-3,7,8,9
Spaceship                 N014800   W        X              0-4,7,8,9
Aunt May                  N014900   W        X              0-4,7,8,9      1          0-6
High Tech Pizza           N015900   R        X              0-4,7,8,9
Job Application           N019000   W        X              0-4,7,8,9      2
Funding Space Center      N018000   W        X              0-4,7,8,9
Uncle                     N020000   W        X              0-4,7,8,9      1
Bike Lane                 N021000   W        X              0-4,7,8,9

* [Beginning of note illegible] descriptive information about the student's response.

Note: X marks under Grade/Age Use are listed as they appear in the source;
where fewer than three are shown, the source does not preserve which of the
grade/age columns (4/9, 8/13, 11/17) they occupied.


8.2.1.2    Holistic Scoring


Four items were also scored holistically. These were items planned for
use in the trend analysis and so included responses from Years 5 and 10 as
well as from Year 15. The responses for each task for each age were
randomly mixed together and rated relative to each other. The holistic
scoring was performed as a separate task from the primary trait scoring by
a different group of scorers. Holistic scorers evaluated each response
according to overall impression, then assigned scores from 1 to 6 (with a
special score for papers that were blank or unrateable). Holistic scores
were  placed  in  designated  boxes  on  the  back  covers  of  the  booklets. 

8.2.1.3    Mechanics  Scoring 

In  addition  to  primary  trait  and  holistic  scoring,  a  third  procedure, 
scoring  for  mechanics,  was  applied  to  a  subsample  of  responses  from  the 
exercise  "Hole  in  the  Box."    Five  hundred  essays  were  selected  from  each 
age  for  each  of  the  three  assessment  years  in  which  the  exercise  was 
administered.    Each  group  of  500  essays  selected  for  each  age  included 
responses  from  200  students  who  were  black  and  300  students  who  were  not. 

The  responses  were  duplicated  with  the  student  identification  number 
indicated  on  the  copy.    They  were  bundled  by  age  in  such  a  manner  that 
responses  from  the  three  assessment  years  were  randomly  mixed.  The 
mechanics  scoring  evaluated  the  elements  of  sentence  construction,  word 
choice,  spelling,  punctuation,  and  capitalization.    To  do  this,  a  reader 
wrote  symbols  in  red  ink  at  each  word  or  punctuation  mark  in  error  and  at 
the  ends  of  sentences  to  indicate  sentence  type  or  faulty  sentence 
construction. 

To  analyze  the  data  from  the  mechanics  scoring,  criteria  were  devised 
to  derive  scores  from  mechanical  scoring  codes.    The  codes  included: 

(1)  the  number  of  words  in  an  essay; 

(2)  the  number  of  sentences  in  an  essay; 

(3)  the  number  of  letters  in  a  word; 

(4)  the  number  of  "T-Units"; 

(5)  sentence  construction;  and 

(6)  punctuation. 

These  criteria  are  described  below. 

Number  of  Words  in  an  Essay.    Each  blank  space  used  in  key  entry 
of  an  essay  counted  as  one  word.    For  errors  that  occurred  when  a 
student separated one word into two (e.g., "mail man" for
"mailman"), readers enclosed the error in brackets to indicate
that the two words should be counted as one.

Words which could not be deciphered were circled by readers
and followed by the letter "L." The "L" was keypunched with the
essay to indicate that the circled material should be counted as
one word.


Number  of  Sentences  in  an  Essay.    Certain  mechanical  scoring  codes 
were  used  at  the  end  of  a  sentence.    After  keypunching  was 
completed  for  an  essay,  these  codes  were  tallied;  the  total 
counted  as  the  number  of  sentences  in  an  essay. 


Number  of  Letters  in  a  Word.  The  mean  length  of  the  words  used  by 
a  student  was  determined  by  dividing  the  number  of  letters  used  to 
keypunch  an  essay  by  the  number  of  words  in  an  essay. 
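The two counting rules above can be illustrated with a minimal sketch. The keying convention used here (square brackets around a split word, as in "[mail man]") and the function names are assumptions made for illustration; the report does not specify the actual keypunch format, and the "L" convention for undecipherable words is not modeled.

```python
import re

def count_words(keyed_text: str) -> int:
    """Count blank-delimited words, treating each bracketed group as one word."""
    # Assumed keying convention: "[mail man]" marks a word split in two.
    collapsed = re.sub(r"\[[^\]]*\]", "W", keyed_text)
    return len(collapsed.split())

def mean_word_length(keyed_text: str) -> float:
    """Letters keyed divided by the number of words (0.0 for an empty essay)."""
    letters = sum(ch.isalpha() for ch in keyed_text)
    words = count_words(keyed_text)
    return letters / words if words else 0.0

essay = "the [mail man] came"
print(count_words(essay), round(mean_word_length(essay), 2))
```

Only alphabetic characters are counted as letters in this sketch; whether punctuation contributed to the original letter count is not stated.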


T-Units.    NAEP  uses  T-Units  to  assess  the  quality  of  syntax  used 
in  an  essay.    A  T-Unit  is  an  independent  clause  and  all  of  its 
modifying  words,  phrases,  and  clauses.    T-Unit  counts  were 
calculated  as  follows: 

(1)  a  simple  sentence  counted  as  1  T-Unit; 

(2)  a  complex  sentence  counted  as  1  T-Unit; 

(3)  a  compound  sentence  counted  as  2  T-Units; 

(4)  a  sentence  fragment  was  added  to  a  following 
sentence  so  that  it  became  a  clause  (constituting  a 
T-Unit)  in  the  new  sentence;  and 

(5)  a  run-on  sentence  constituted  several  T-Units, 
depending  upon  the  number  of  clauses  it  contained. 
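As an illustration of the five T-Unit rules, the following sketch tallies T-Units from a list of sentence codes. The code names and the (kind, clauses) representation are hypothetical. The sketch also assumes one reading of rule (4): a fragment contributes one T-Unit to the sentence it is attached to.

```python
def count_t_units(sentences):
    """Tally T-Units for an essay.

    sentences: list of (kind, clauses) pairs, where kind is one of
    "simple", "complex", "compound", "fragment", "run-on" and clauses
    is the number of clauses (used only for run-on sentences).
    """
    total = 0
    for kind, clauses in sentences:
        if kind in ("simple", "complex"):
            total += 1          # rules (1) and (2)
        elif kind == "compound":
            total += 2          # rule (3)
        elif kind == "fragment":
            total += 1          # rule (4), under the stated assumption
        elif kind == "run-on":
            total += clauses    # rule (5)
        else:
            raise ValueError(f"unknown sentence code: {kind!r}")
    return total

essay = [("simple", 1), ("compound", 2), ("run-on", 3),
         ("fragment", 1), ("complex", 2)]
print(count_t_units(essay))
```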


Sentence  Construction.  To  assess  further  the  quality  of  syntax 
used  by  students,  the  following  were  calculated: 

(1)  percent  of  simple  sentences; 

(2)  percent  of  compound  sentences; 

(3)  percent  of  complex  sentences; 

(4)  percent  of  sentence  fragments;  and 

(5)  percent  of  run-on  sentences. 

To  determine  the  number  of  instances  of  faulty  sentence 
construction,  the  following  were  calculated: 

(1) the average number and percent of sentences with
agreement errors (obtained by dividing the number of
"A"s assigned to an essay by a reader by the number
of sentences in that essay; the letter "A" is used
to signify agreement errors in sentence
construction);

(2) the number of errors in word choice; and

(3) the number and percent of sentences that were
considered awkward.
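The sentence-construction measures reduce to simple ratios. The sketch below, with hypothetical names and sentence codes, shows the percent of each sentence type and the agreement-error rate (number of "A" codes divided by number of sentences) for a single essay.

```python
from collections import Counter

SENTENCE_TYPES = ("simple", "compound", "complex", "fragment", "run-on")

def sentence_type_percents(kinds):
    """Percent of an essay's sentences falling in each sentence type."""
    counts = Counter(kinds)
    n = len(kinds)
    return {k: (100.0 * counts[k] / n if n else 0.0) for k in SENTENCE_TYPES}

def agreement_errors_per_sentence(num_a_codes, num_sentences):
    """Number of "A" codes divided by the number of sentences in the essay."""
    return num_a_codes / num_sentences if num_sentences else 0.0

kinds = ["simple", "simple", "compound", "fragment"]
print(sentence_type_percents(kinds))
print(agreement_errors_per_sentence(3, 12))
```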


Punctuation.    Counts  were  obtained  for  the  following  errors: 

(1)  the  average  number  and  percent  of  misspelled  words; 
and 

(2)  the  average  number  of  errors  in  capitalization. 

Punctuation errors were divided into three categories:

(1)  errors  involving  commas  and  dashes; 

(2)  errors  involving  end  marks  (periods,  question  marks, 
and  exclamation  points);  and 

(3) errors involving other forms of punctuation.

Two kinds of punctuation error were then tallied:

(1)  errors  of  commission  for  each  of  the  three 
categories  above  and  for  overall  punctuation;  and 

(2)  errors  of  omission  for  each  of  the  three  categories 
above  and  for  overall  punctuation. 


In addition to the data specified above, NAEP obtained a summary of all
mechanical errors for "good" and "poor" essays. The terms "good" and
"poor" refer to the primary trait scores assigned to each essay.


8.2.2    The  Scoring  Operation 


8.2.2.1  Scorers 

Fourteen  persons  were  hired  specifically  to  score  the  NAEP  reading  and 
writing  exercises  using  primary  trait  scoring.    The  same  fourteen  persons 
also  performed  the  mechanics  scoring. 

Generally,  the  persons  chosen  had  teaching  experience  ranging  from  the 
pre-school  to  the  community  college  level.    The  group  included  men  and 
women  of  various  ages  and  racial/ethnic  groups  who  had  lived  and/or  gone  to 
school  in  various  parts  of  the  country,  and  who  had  BA  and  MA  degrees  (a 
few were working toward doctoral degrees). The persons who performed the
holistic scoring were required to be presently teaching.




8.2.2.2  Training:    Primary  Trait  Scoring

Before  the  training  of  the  scorers  began,  NAEP  staff  worked  with  the 
scoring  coordinator  and  assistant  coordinator  to  prepare  training  sets  and 
to  refine  the  scoring  guides. 

Training began with the 26 items administered to Grade 8/Age 13. This
training involved explaining the item and its scoring guide, discussing
responses that were representative of the various score points in the
guide, then scoring and discussing approximately 65 to 100 randomly
selected responses. The purpose of the training was to familiarize the
group with the scoring guides and to reach a high level of agreement among
the scorers. After the group training was completed, each scorer scored
the items in each of fourteen bundles of booklets. Their scores were
recorded and a follow-up session was held to discuss those responses for
which there was a wide range of scores. Once the follow-up session was
completed, the scoring began. Initial training was completed in
approximately one month.

As a follow-up to training, notes on various items were compiled and
distributed to the scorers for their reference. In addition, short
training sessions were conducted on items that showed low reliability. The
scoring supervisor consulted with individual scorers as the scoring
progressed. When a scorer was judged to be causing a discrepancy, the
supervisor would discuss the response and its score with that scorer.

As scoring began for each of the other two grade/age levels, training
was conducted on the items unique to those levels. The training was the
same as that conducted initially and took about one week for each grade/age
level.

8.2.2.3  Training:    Holistic  Scoring

The training for holistic scoring involved several steps. First, the
table leaders, all of whom were experienced holistic readers, surveyed the
pool of papers from assessments and selected anchor papers, that is, papers
representative of six levels of proficiency. Then, they developed
guidelines describing each level and how to distinguish between top-half
and bottom-half papers. The training began with some discussion of the
characteristics of the anchor papers and guidelines, then included several
practice scorings of other papers to refine the scoring scale description
and to resolve discrepancies among readers. When all readers were
comfortable with the guidelines, they scored papers for an hour, after
which they discussed additional anchor papers. Throughout the subsequent
scoring there were periodic discussions of papers to ensure that readers
continued to adhere to the same standards.




8.2.2.4    Training:    Mechanics  Scoring


To  prepare  for  mechanics  training,  the  scoring  coordinator  and  an 
outside  consultant  with  experience  in  mechanics  scoring  refined  the 
guidelines  and  selected  papers  to  be  used  in  training.    The  training  itself 
involved  discussing  the  guidelines  and  sample  responses  which  had  already 
been  scored.    The  scorers  then  practiced  scoring  other  papers,  and 
discussion  was  held  when  any  discrepancies  occurred.    When  readers  were 
comfortable  with  the  guidelines,  the  actual  scoring  began.  Several 
follow-up  training  sessions  were  conducted  as  problems  arose. 


8.2.2.5    Assignment  of  Work

For  the  primary  trait  scoring,  the  scorers  received  the  booklets  in 
batches  as  they  were  received  from  the  schools.    A  reader  scored  all 
open-ended  items  in  all  booklets  of  a  batch.    Because  of  the  spiral  design, 
a  reader  would  encounter  many,  if  not  all,  of  the  items  at  a  grade/age 
level  as  he  or  she  scored  a  batch  of  booklets.    Thus  the  reader  had 
continual  exposure  to  all  items  throughout  the  scoring.    Interspersed  among 
the batches of Year 15 booklets were the responses for several items from
the two other assessment years. The responses for each item were bundled
together in groups of 25 by age and by assessment year.

The three grade/age levels were scored separately, beginning with Grade
8/Age 13, continuing with Grade 4/Age 9, and ending with Grade 11/Age 17.
It was hypothesized that this procedure may have led to a "batch effect":
that is, the Grade 4/Age 9 essays may have been evaluated as too high
because, after reading the essays written for Grade 8/Age 13, scorers may
have considered the Grade 4/Age 9 responses "pretty good for fourth
graders." Correspondingly, Grade 11/Age 17 responses may have been rated
too low because, following the Grade 4/Age 9 responses, they may not have
seemed "that good for eleventh graders."

To determine the effect of scoring the papers in batches by grade/age
levels, an experiment was performed in which NAEP written responses from
all three grade/age levels were randomly ordered, then re-scored. It was
decided that if batch effects exceeded one-tenth of a score point per item,
post hoc adjustments of the writing scale values would be warranted.

The  experiment  was  based  on  responses  to  three  writing  tasks  that  were 
administered  to  all  three  grade  levels — School  Rule,  Food  on  the  Frontier, 
and  Swimming  Pool.    For  each  writing  task  at  each  grade/age  level,  a 
representative  subsample  of  156  to  174  papers  was  drawn.    Because  the 
booklets  administered  to  each  grade/age  level  were  different  colors,  the 
responses  were  photocopied,  then  reordered  using  a  randomly  selected 
permutation  of  their  sequence  numbers.    The  responses  were  then  scored  by 
two  experienced  readers.    The  data  were  analyzed,  and  it  was  concluded  that 
no  adjustment  of  writing  scale  values  was  required.    (See  Chapter  11.1  for 
more  information  on  the  batching  effect.) 
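The logic of the batch-effect experiment can be sketched as follows. This is an illustration only: the function names are hypothetical, and the comparison shown (mean difference between original and re-scored values for the same papers) is an assumed simplification, not the analysis reported in Chapter 11.1. Only the one-tenth-of-a-point criterion is taken from the text.

```python
import random
from statistics import mean

THRESHOLD = 0.1  # one-tenth of a score point per item, as stated in the text

def random_order(sequence_numbers, seed=None):
    """Return the sequence numbers in a randomly selected permutation."""
    rng = random.Random(seed)
    order = list(sequence_numbers)
    rng.shuffle(order)
    return order

def batch_effect(original_scores, rescored_scores):
    """Mean of (original - re-score) over the same papers, in score points."""
    return mean(o - r for o, r in zip(original_scores, rescored_scores))

def needs_adjustment(effect):
    """True when the effect exceeds the criterion in either direction."""
    return abs(effect) > THRESHOLD

effect = batch_effect([3, 2, 4, 3], [3, 2, 3, 3])
print(effect, needs_adjustment(effect))
```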




The  other  two  scoring  procedures  were  performed  under  basically  the 
same  conditions.    Each  cohort  was  scored  separately  and  responses  from  the 
two  previous  assessment  years  were  mixed  in  with  those  from  Year  15. 
However,  the  holistic  scorers  received  only  those  booklets  which  had  the 
holistic  items  in  them  and  so  did  not  receive  the  booklets  in  batches  as 
they  came  from  the  schools.    For  the  mechanics  scoring,  the  readers 
received  photocopies  of  the  responses. 


8.2.2.6    Reliability  and  Resolution 

Twenty  percent  of  the  primary  trait  items  were  subject  to  a  reliability 
check,  which  entailed  a  second  reading  by  a  different  scorer.    To  prevent  a 
second  reader  from  being  influenced  by  the  first  reader's  scores,  the  first 
reader  masked  all  the  scores  in  every  fifth  booklet  in  a  batch.  These 
booklets  were  passed  along  to  a  second  reader,  who  scored  for  the  primary 
trait  only.    All  scoring  discrepancies  were  independently  resolved  by  the 
scoring  supervisors  who  assigned  a  resolution  score.    In  most  instances, 
this  score  was  the  same  as  one  of  the  given  scores.    However,  in  a  few 
cases,  neither  score  was  considered  correct  and  so  a  different  score  was 
given.    Although  the  secondary  trait  scores  were  not  subject  to  a 
reliability  check,  they  were  sometimes  adjusted  by  the  scoring  supervisor 
to  maintain  consistency  with  the  resolved  primary  trait  score.  (See 
Chapter  11.1  for  a  description  of  the  results  of  the  reliability  check.) 

Holistically scored items were also subject to a 20 percent reliability
check. The scores of the first reader were masked and the papers were
passed on to a second reader. When discrepancies occurred, alternating
high and low scores were assigned if the scores were one point apart. That
is, if the first occasion of a discrepancy of one point was resolved by
assigning the lower of the two scores given to an essay, the next occasion
of a discrepancy of one point was resolved by assigning the higher score.
Discrepancies of two or more points were resolved by the scoring director,
as in the case of the primary trait scores.
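The alternating rule for one-point discrepancies can be expressed as a small stateful procedure. The class and method names below are hypothetical, and the choice to begin with the lower score is an assumption; the text states only that high and low scores alternated.

```python
class HolisticResolver:
    """Resolve pairs of holistic scores using the alternating rule."""

    def __init__(self):
        # Assumption: the first one-point discrepancy takes the lower score.
        self._take_lower_next = True

    def resolve(self, first, second):
        """Return the resolved score, or None when referral is required."""
        if first == second:
            return first
        if abs(first - second) == 1:
            score = min(first, second) if self._take_lower_next else max(first, second)
            self._take_lower_next = not self._take_lower_next
            return score
        return None  # two or more points apart: referred to the scoring director

resolver = HolisticResolver()
print([resolver.resolve(a, b) for a, b in [(3, 4), (5, 4), (2, 2), (2, 5)]])
```

Agreeing pairs and referred pairs do not advance the alternation in this sketch, so the high/low balance depends only on the one-point cases.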

The  same  general  procedures  were  followed  for  the  mechanics  scoring: 
20  percent  of  the  responses  were  re-scored;  second  scorers  did  not  see  the 
first  scores;  and  discrepancies  were  resolved  by  the  scoring  supervisor. 


8.2.2.7    Data  Entry 

After  the  scoring  was  completed,  the  booklets  were  sent  to  data  entry, 
where  the  scorer  ID  numbers  from  the  front  cover  and  the  scores  from  the 
back covers were entered. (See Chapter 8.3 for details concerning the data
entry process and Chapter 8.4 for information concerning editing data.)
The  booklets  went  to  key  entry  in  batches  except  for  booklets  which  had 
items  for  holistic  scoring;  these  were  pulled  from  the  batches  and  held. 
After  holistic  scoring  was  completed,  those  booklets  were  sent  to  key 
entry. 




Chapter  8.3 
DATA  ENTRY  SYSTEM 

Alfred  M.  Rogers 
Educational  Testing  Service 


The  transcription  of  response  data  from  paper  to  machine-readable  form 
is one of the most important yet often overlooked aspects of any research
project.  Among  the  many  issues  to  be  considered  are  the  collection, 
delivery,  and  management  of  the  physical  data;  the  actual  machinery, 
including  hardware  and  software,  employed  for  the  transcription  process; 
the  validation  of  the  machine-resident  data;  and  the  management  of  the  data 


In terms of volume of data collected, the Year 15 NAEP was comparable
to most administrations of the large testing programs at ETS and
within the capacity of extant data transcription technology. However, the
BIB design and the spiralled distribution of its many booklets created a
complexity beyond the capability of that technology. A new methodology was
developed for the sole purpose of transcribing the NAEP data into
computer-readable form. This chapter traces the development and
implementation of this system from a discussion of its requirements,
through a description of its design, to a detailed exposition of its
operation.

Figure  8.3-1  is  a  schematic  diagram  representing  the  processing  flow  of 
student  assessment  materials  through  the  data  entry  system.    The  reader  may 
refer  to  this  diagram  for  clarification  of  the  relationships  among  the 
components  of  this  system. 


8.3.1    System  Requirements 

The  primary  consideration  in  the  design  of  any  data  transcription 
scheme  should  be  the  interaction  between  the  entry  operator  and  the 
machine.  An  effective  system  should  provide  direct  access  to  the  data,  a 
convenient  entry  and  editing  mechanism,  and  an  accurate  display  or 
representation of the data values. At the next level, the system should
provide data and file management capabilities and error detection and
correction  functions.  Finally,  a  complete  system  should  also  include  status 
reporting  and  quality  control  procedures. 

[Figure 8.3-1. Student Data Entry Processing]

The data terminal provides two interface components between the
operator and the machine: the keyboard and the display device. The
arrangement and function of the keys on the keyboard are critical to the
speed and accuracy with which data and commands can be entered by an
operator. A numeric keypad is preferable to the typewriter keyboard because
it allows the operator to use one stationary hand for entry while freeing
the other hand for page turning or other tasks. If, as in the case of the
NAEP instruments, the responses are coded in alphabetic format rather than
numeric, the keypad numbers may be converted to letters through program
control, rendering the keypad a more powerful entry device. Unless the
entry operator is a skilled typist, the entry of alphabetic and special
characters can slow entry processing considerably by adding key search time
and intermittent hand movement from keyboard to booklet.
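Conversion of keypad numbers to letters "through program control" amounts to a lookup applied to each keystroke. The mapping below (1 to A, 2 to B, and so on) is hypothetical; the report does not give the actual correspondence.

```python
# Hypothetical mapping: keypad digit -> response letter printed in the booklet.
KEYPAD_TO_LETTER = {str(n): chr(ord("A") + n - 1) for n in range(1, 10)}

def translate(keystrokes: str) -> str:
    """Convert a run of keypad digits to response letters; reject other keys."""
    try:
        return "".join(KEYPAD_TO_LETTER[k] for k in keystrokes)
    except KeyError as bad:
        raise ValueError(f"not a keypad response key: {bad}") from None

print(translate("1325"))
```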

The  manner  in  which  data  are  displayed  can  also  have  significant  impact 
on  the  efficiency  of  the  entry  process.  The  most  primitive  mode  of  display 
is the line editor mode, in which data appear as a continuous string of
characters,  wrapping  around  as  many  lines  of  the  display  device  as  are 
required  to  display  the  entire  record.  This  puts  a  considerable  burden  on 
the  entry  operator  to  be  able  to  identify  a  value  in  the  data  record  from 
its  location  in  the  instrument.  Even  if  the  displayed  lines  were  enhanced 
with  rulers  indicating  column  positions,  the  operator  would  be  required  to 
know  the  correspondence  between  column  position  and  item  number. 

A  more  desirable  alternative  is  the  full  screen  mode,  which  uses  panels 
or  forms  as  the  data  input  and  display  mechanism.  A  form  may  be  regarded  as 
a  template  consisting  of  protected  areas  (text)  and  unprotected  areas 
(fields).  The  unprotected  areas  of  a  form  are  the  "holes"  in  the  template 
where  data  values  may  be  written  into  or  read  by  an  application  program. 
The  protected  area  is  the  "body"  of  the  template  which  cannot  be  accessed 
by  the  application.  A  form  designed  for  the  entry  and  display  of  data  could 
have  a  separate  field  for  each  data  value,  a  description  of  each  field  in 
an  adjacent  text  area,  and  both  text  and  field  arranged  in  a  logical  order 
consistent  with  the  layout  of  the  instrument. 
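The template idea can be made concrete with a small data structure. This is a sketch under assumed names (Text, Field, Form); it does not reproduce the interface of any forms management product mentioned in this chapter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Text:
    """Protected area: a label the application cannot change."""
    label: str

@dataclass
class Field:
    """Unprotected area: a "hole" the application may read or write."""
    name: str
    width: int
    value: str = ""

@dataclass
class Form:
    layout: list  # Text and Field objects in instrument order

    def _field(self, name):
        for part in self.layout:
            if isinstance(part, Field) and part.name == name:
                return part
        raise KeyError(name)

    def write(self, name, value):
        target = self._field(name)
        if len(value) > target.width:
            raise ValueError(f"{value!r} exceeds width {target.width}")
        target.value = value

    def read(self, name):
        return self._field(name).value

    def render(self):
        return " ".join(
            p.label if isinstance(p, Text) else f"[{p.value:<{p.width}}]"
            for p in self.layout
        )

form = Form([Text("Item 1:"), Field("item1", 1)])
form.write("item1", "B")
print(form.render())
```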

Since  human  eye-hand  coordination  is  subject  to  error,  any  data  entry 
system operated by humans should provide three modes of operation: entry,
verification,  and  resolution.  The  entry  mode  takes  the  operator's 
keystrokes,  validates  them  for  data  type  and  value  range,  and  creates  the 
data  record.  The  verification  mode  takes  the  operator's  keystrokes, 
validates  them  again,  and  compares  them  with  the  data  values  on  the 
previously  written  record.  It  should  notify  the  operator  of  any 
inconsistency  and  permit  over-writing  of  the  field  if  the  initial  value  is 
determined  to  be  in  error.  The  resolution  mode  displays  the  current  data 
values  and  permits  the  selection  and  correction  of  any  field. 
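The three modes can be sketched as three functions over a list-valued data record. The allowed value range, the function names, and the confirm_overwrite callback (standing in for the operator's decision) are all assumptions made for illustration.

```python
ALLOWED = set("ABCDE ")  # hypothetical value range: five options or blank

def validate(value, allowed=ALLOWED):
    if value not in allowed:
        raise ValueError(f"value {value!r} out of range")
    return value

def enter(keystrokes, allowed=ALLOWED):
    """Entry mode: validate each keystroke and create the data record."""
    return [validate(k, allowed) for k in keystrokes]

def verify(record, keystrokes, confirm_overwrite, allowed=ALLOWED):
    """Verification mode: re-key, compare, and overwrite where confirmed."""
    mismatches = []
    for i, (old, new) in enumerate(zip(record, keystrokes)):
        validate(new, allowed)
        if old != new:
            mismatches.append((i, old, new))
            if confirm_overwrite(i, old, new):
                record[i] = new
    return mismatches

def resolve(record, index, value, allowed=ALLOWED):
    """Resolution mode: correct one selected field of the current record."""
    record[index] = validate(value, allowed)

record = enter("ABCA")
print(verify(record, "ABDA", lambda i, old, new: True), record)
```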

These basic requirements are complicated by the special demands of the
NAEP design. The spiral design combines a relatively small number of
subtests or blocks into a multitude of booklets. Any booklet-oriented entry
system would need one data format for each booklet and require that all
booklets of the same type be batched together for efficient processing. The
spiralled distribution of the assessment booklets renders this approach
impracticable, if not impossible, since the incoming bundles of booklets
would first have to be separated into piles of like booklets and entry
processing would have to be delayed until there were enough booklets in a
pile to make the effort productive. The separation process would also
disrupt the session identity of the booklets, and special care would be
required to ensure their proper identification.

A  more  appropriate  entry  system  would  maintain  this  session  identity  by 
processing the incoming bundles as complete units. As each booklet within
the batch was presented for processing, the system would determine the
format  to  be  used  according  to  the  booklet  identification  code.  Since  there 
are  fewer  block  types  than  booklet  types,  a  more  logical,  and  economical, 
extension  of  this  concept  would  be  to  treat  the  booklet  format  as  a 
sequence  of  block  formats. 
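Treating a booklet format as a sequence of block formats means that only the block formats need to be defined once; each booklet is then a short list of block codes. The codes and field names below are invented for illustration and do not correspond to actual NAEP blocks or booklets.

```python
# Hypothetical block formats: block code -> ordered item fields.
BLOCK_FORMATS = {
    "BG": ["BG01", "BG02"],
    "R1": ["R101", "R102", "R103"],
    "W2": ["W201"],
}

# Hypothetical booklet map: booklet code -> ordered block codes.
BOOKLET_BLOCKS = {
    "07": ["BG", "R1", "W2"],
    "12": ["BG", "W2", "R1"],
}

def booklet_format(booklet_code):
    """Build a booklet's entry format by concatenating its block formats."""
    return [
        item
        for block in BOOKLET_BLOCKS[booklet_code]
        for item in BLOCK_FORMATS[block]
    ]

print(booklet_format("07"))
print(booklet_format("12"))
```

Two booklets that share the same blocks in a different order reuse the same three block definitions, which is the economy the text describes.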


8.3.2    Machine  Considerations 

Due to the time and budget constraints between the awarding of the NAEP
grant  and  the  field  administrations,  it  was  not  possible  to  develop 
scannable  or  machine-readable  booklets  or  answer  sheets.  The  assessment 
instruments  were  marked  or  written  in  and  manually  transcribed  to  machine- 
readable  form.  Conventional  key  entry  systems,  which  process  at  the  booklet 
level,  were  ruled  out  for  the  reasons  mentioned  above.  At  the  time  of 
preparation for data collection, ETS was installing a more sophisticated
key  entry  system  which  could  be  programmed  to  operate  at  the  block  format 
level.  However,  it  was  not  anticipated  that  the  expertise  in  using  this 
system  could  be  developed  before  entry  processing  would  begin,  nor  was  it 
known  whether  the  system  could  handle  the  number  of  block  formats. 

The  remaining  alternative  was  a  computer-based  entry  system,  or  more 
appropriately,  an  interactive  program  for  data  generation  and  management 
operating  on  a  mini-  or  mainframe  computer  system.  The  only  computers 
available  for  use  were  a  VAX  11-780  system  running  under  VMS,  and  an  IBM 
3083  system  running  under  OS/MVS-TSO. 

Both  machines  offer  very  similar  programming  environments  for  the 
development  and  implementation  of  interactive  systems:  full-screen  editors 
for  program  code  and  control  data  creation  and  modification;  assemblers, 
FORTRAN  compilers,  and  linkage  editors  for  program  construction;  forms 
management  systems  with  editors,  utilities,  and  callable  interfaces  with 
other  languages;  and  direct-access  data  storage  with  sequential,  library, 
and  indexed  data  structures.  The  data  terminals  for  both  machines  provide 
similar  environments  for  the  entry  operator:  a  full  typewriter-style 
keyboard,  numeric  keypad,  and  function  keys. 

The  IBM  Time  Sharing  Option  (TSO)  is  a  multi-user  interactive 
subsystem  with  full  capabilities  for  program  creation,  testing,  and 
implementation.  The  System  Productivity  Facility  (SPF)  subset  of  TSO  is  a 
menu-driven,  full-screen  utility  for  the  creation,  editing,  and  maintenance 
of  program  code  and  data  files.  The  Dialog  Manager  Service  (DMS)  allows  the 
program  developer  to  design  and  use  SPF-like  panels  in  full-screen 
application programs. The FORTRAN H compiler conforms to the 1966
standards, with no capability for processing CHARACTER-type data or dynamic
file allocation as in the 1977 standards. Neither does it provide any
capability for processing library files or indexed files, both of which are
valid data structures on the IBM system.


The  VAX  VMS  operating  system  is  designed  as  an  interactive  user 
environment  with  much  the  same  capabilities  as  TSO.  The  EDT  editor  may  be 
used  for  editing  both  program  source  and  data  files.  The  Forms  Management 
System  (FMS)  is  a  separate  product  which  provides  its  own  forms  editor, 
library  management  utilities,  and  callable  interfaces  for  full-screen 
program development. The FORTRAN compiler conforms to the 1977 standards,
with CHARACTER-type data and dynamic file allocation, and interfaces with
library and indexed files. Additionally, two other products available on
the  VAX  had  great  potential  for  higher  order  management  functions:  the 
Common  Data  Dictionary  (CDD)  which  could  store  information  about  data  files 
and  record  structures,  and  DATATRIEVE  for  the  retrieval  and  reporting  of 
information  within  the  CDD. 

From  a  management  standpoint,  the  IBM  was  preferable  to  the  VAX,  not 
only  because  the  NAEP  data  analysis  would  be  performed  on  that  machine,  but 
because  the  rates  were  more  favorable.  One  important  factor  to  be 
considered  was  utilization.  The  IBM  TSO  was  heavily  used  during  prime  shift 
hours  (8:00  a.m.  to  4:00  p.m.)  and  experienced  performance  slowdowns  during 
peak  activity  hours.  The  VAX  was  implemented  as  a  research  and  development 
tool  and  had  no  production  load  to  contend  with.  Although  it  was  this 
difference  in  utilization  which  accounted  for  the  discrepancy  in  rates,  it 
made  the  VAX  more  attractive  for  its  stability. 

From  a  programming  standpoint,  the  VAX  offered  the  most  functionality 
and  flexibility  in  developing  an  interactive  data  entry  system.  The  lack  of 
file  management  interfaces  under  the  IBM  FORTRAN  constrained  the  program 
developer  in  choice  of  file  structures  or  forced  the  development  of  new 
interfaces  or  structures. 

Ultimately,  however,  it  was  from  an  operational  standpoint  that  the  VAX 
was  chosen.  The  forms  input  and  output  functions  on  the  IBM  worked  at  the 
screen level; that is, the calling program would issue a read command to
the  terminal  and  wait  until  the  operator  had  pressed  a  function  key,  at 
which  point  the  contents  of  all  data  fields  would  be  returned.  The  forms 
input  and  output  functions  on  the  VAX  worked  at  the  field  level;  the 
contents  of  each  data  field  on  the  form  could  be  accessed  and  processed 
individually, before passing control to the next data field. This meant
that data could be captured at the keystroke level and validated by the
program, which could then either continue to the next field or notify the
operator of a problem. The IBM program could only perform the validation
after all fields had been entered, forcing the operator to go back through
the booklet for any subsequent error processing.

This  field-level  access  mode  also  made  it  possible  to  process  fields  on 
a  conditional  basis.  Several  background  and  questionnaire  items  in  the 
assessment  had  a  "Specify  Other"  option  in  which  the  respondent  circled  the 
letter preceding the option and filled in a short written response. With
special coding the program could be instructed to capture data from the
open response field if the option was selected or to bypass the field if
another or no option was chosen.
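Conditional field processing of this kind depends on reading one field at a time. In the sketch below, read_field is a hypothetical callable standing in for a field-level read from the forms system, and the field names and the letter marking the "Specify Other" option are assumptions.

```python
def capture_item(read_field, other_letter="E"):
    """Read the circled option; read the written response only if needed."""
    choice = read_field("choice")
    # The open response field is bypassed unless "Specify Other" was circled.
    specify = read_field("specify") if choice == other_letter else ""
    return choice, specify

def make_reader(answers):
    """Build a stand-in field reader that records which fields were requested."""
    asked = []
    def read_field(name):
        asked.append(name)
        return answers[name]
    return read_field, asked

reader, asked = make_reader({"choice": "B", "specify": "unused"})
print(capture_item(reader), asked)
```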

The  VAX  terminals  also  offered  a  "true"  numeric  keypad  as  opposed  to 
the  "function"  keypads  on  the  IBM.  It  would  have  been  possible  to  use 
certain  keys  on  the  typewriter  keyboard  to  emulate  numeric  input  if  the 
need  arose,  but  that,  too,  would  have  placed  additional  demands  on  the 
entry  operators. 


8.3.3    Database  Organization  and  Structure

The organization and internal structure of the data and control files
are the framework around which the entry system is built. A knowledge of the
structure  and  purpose  of  these  files,  as  well  as  their  relationships  to 
each  other,  is  central  to  an  understanding  of  the  entry  system. 

The  storage  of  the  transcribed  data  is  always  the  first  consideration. 
An  indexed  data  file  using  a  unique  identification  code  such  as  the  booklet 
number  as  the  access  key  is  the  most  accurate  and  direct  means  of  storing 
and  retrieving  data  records.  A  single,  albeit  large,  indexed  data  file  is 
conceptually  the  easiest  solution  to  the  needs  of  data  storage  and  access, 
but  creates  other  problems  from  a  management  perspective.  The  records 
within  an  indexed  data  file  are  stored  in  ascending  key  order.  As  records 
are  added  to  the  file,  they  are  inserted  between  existing  records  to 
preserve  the  sort  order,  and  the  pointers  to  these  records  are  updated. 
Under  the  spiral  design,  any  spiral  session  contains  a  wide  assortment  of 
booklet  numbers.  Inserting  these  records  into  an  indexed  master  file  would 
not  only  incur  additional  overhead  processing  for  reorganizing  the  data 
file,  but  disrupt,  if  not  destroy,  the  session  identity  of  the  booklets. 

An  alternative  solution  would  be  to  store  the  data  in  smaller  "batch" 
files,  borrowing  a  term  from  optical  scanning  and  key  entry  methodology. 
The  session  is  a  logical  choice  for  a  unit  of  batch  processing:  all  field 
administration  management  functions  were  done  at  the  session  level;  the 
session  sizes  are  fairly  consistent  at  about  23  to  27  booklets  each;  and 
each  session  is  uniquely  identified  by  its  PSU,  school  and  session  codes. 
In  any  case,  maintaining  session  identity  of  the  booklets  was  a  primary 
consideration.  The  data  records  within  a  batch  file  would  still  be  stored 
in key order, but once they were written to the file, any subsequent
activity  would  not  require  reorganization. 

Having  hundreds  of  batch  files  on  the  computer  necessitated  some  means 
of  keeping  track  of  each  batch  as  it  went  through  the  entry  system.  A 
tracking  file  was  designed  in  which  each  record  would  store  the  processing 
history  of  a  single  batch  data  file.  The  tracking  file  is  also  an  indexed 
file,  using  the  session  identification  code  as  the  access  key.  This 
one-to-one  correspondence  between  tracking  file  record  and  batch  data  file 
is  central  to  the  processing  and  management  functions  of  the  student  data 
entry  system. 




Because  the  verification  and  resolution  processes  are  capable  of 
altering  data  values  in  the  batch  files,  an  audit  trail  mechanism  is 
required to trace the evolution of the data through these procedures. For
that  purpose,  a  separate  audit  file  is  maintained  for  each  batch  data  file 
on  the  system.  Any  time  an  anomalous  data  value  is  detected,  or  a  data 
value  is  altered  by  either  the  verification  or  resolution  process,  a  record 
is  written  to  the  audit  file,  giving  complete  information  about  the 
booklet,  section,  and  item  number  for  the  field,  the  old  and  new  data 
values,  and  the  date  and  time  the  action  occurred.  These  files  are 
organized  as  sequential  files,  to  which  each  audit  record  is  appended 
during  the  different  stages  of  the  entry  process. 

The  data  entry  system  required  these  three  types  of  files  for  data 
storage  and  processing  control.  The  actual  operation  of  the  entry  program 
required  two  additional  files:  the  forms  library  and  the  forms  parameters 
file. 

The  forms  library  stores  all  of  the  forms  used  in  the  entry  system.  The 
forms  are  created  and  updated  using  the  FMS  editor.  The  library  is  updated 
and  maintained  using  FMS  utilities.  The  forms  are  accessed  by  the  entry 
programs  through  the  FMS  forms  driver  routines. 

The  forms  parameters  file  is  designed  as  an  adjunct  to  the  forms 
library  for  the  control  of  field  processing  within  each  form.  Each  record 
in  the  parameters  file  corresponds  to  one  field  in  one  form.  The  parameters 
file  is  organized  as  an  indexed  file  using  the  form  name  and  field  sequence 
number  as  the  access  key.  The  integration  of  the  forms  library  and  the  form 
parameters  file  will  be  elaborated  later. 


8.3.4    Program  Structure  and  Execution 


The  data  entry  system,  as  used  by  the  entry  operator,  is  a  single 
FORTRAN-written  program  with  special  subprograms  to  handle  the  various 
components.  The  program  is  initiated  by  a  single  command  from  the  entry 
operator. 

The  program's  first  task  is  to  define  its  operating  environment  for 
recording  on  the  tracking  file  and  audit  trails.  The  date,  time,  and 
terminal  address  can  be  obtained  using  system-resident  functions,  but  the 
operator  identification  must  be  requested  from  the  operator.  When  the 
operator  code  is  entered,  the  program  displays  the  primary  options  menu 
form.  This  form  contains  two  fields  to  be  filled  in  by  the  operator  and  a 
listing  of  the  options  and  their  numeric  codes  below  each  field.  The  first 
field  to  be  filled  is  the  OPTION  code,  indicating  which  instrument  is  to  be 
processed: the school worksheet, student data, Excluded Student
Questionnaire,  Teacher  Questionnaire,  or  School  Questionnaire.  The  second 
field  is  the  MODE  code,  indicating  whether  the  selected  instrument  is  to  be 
processed  for  entry,  verification,  or  resolution. 

The  definition  of  the  program  environment  continues  with  the  validation 
and storage of the entered codes. The environmental and other control
parameters are stored in a COMMON data area for use by the other components
of the system. The forms library, form parameters file, and tracking file
are then opened or readied for processing. The program then transfers
control to one of the five subprograms corresponding to the OPTION
selected.


8.3.5    School Worksheet Processing

The  school  worksheet  entry  subprogram  performs  two  functions;  it 
provides  for  the  entry  of  session  administration  information  from  the 
school  worksheet,  and  initiates  processing  of  the  data  for  that  session. 
This program uses two forms for the collection of data. The first form
requests the school identification code and the number of spiral and tape
sessions  administered  in  that  school.  The  total  number  of  sessions 
determines  the  number  of  times  the  second  form  is  used  for  the  entry  of 
session-specific  information.  The  data  from  each  column  of  the  school 
worksheet  is  entered  and  stored  on  a  separate  record  on  the  tracking  file. 

As mentioned above, the tracking file is indexed, using the school and
session code as the access key. To ensure that this key is unique, the
sessions  within  each  school  are  assigned  codes  accordingly;  regular  spiral 
sessions  are  assigned  codes  from  01  to  10;  regular  tape  sessions  are 
assigned  the  booklet  number  used  in  the  session,  in  the  range  of  64  to  67; 
makeup  spiral  sessions  are  assigned  codes  11  to  15;  and  makeup  tape 
sessions  receive  the  value  of  the  booklet  number  plus  ten,  in  the  range  of 
74  to  77. 
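
The code assignment rule above can be sketched as a small function. This is only an illustration in Python, not the FORTRAN used in the entry system; the function name, its arguments, and the idea of numbering makeup spiral sessions from 1 to 5 before adding ten are assumptions made for the example.

```python
def session_code(kind: str, makeup: bool, sequence: int = 0, booklet: int = 0) -> str:
    """Return a two-digit session code that is unique within one school.

    kind     -- "spiral" or "tape" (hypothetical labels)
    sequence -- running number of a spiral session (1-10 regular, 1-5 makeup)
    booklet  -- booklet number used in a tape session (64-67)
    """
    offset = 10 if makeup else 0
    if kind == "spiral":
        low, high = (11, 15) if makeup else (1, 10)
        code = sequence + offset
        if not low <= code <= high:
            raise ValueError("spiral session number out of range")
    elif kind == "tape":
        if not 64 <= booklet <= 67:
            raise ValueError("tape booklet number out of range")
        code = booklet + offset  # makeup tape sessions land in 74-77
    else:
        raise ValueError("unknown session kind")
    return f"{code:02d}"
```

Because the four ranges (01 to 10, 11 to 15, 64 to 67, 74 to 77) do not overlap, the school code combined with this session code gives a unique key.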

The  remainder  of  the  tracking  record  is  initialized  to  blanks  for  the 
date  and  time  stamp  fields,  and  zeros  for  the  count  fields.  The  record  is 
then  written  to  the  tracking  file,  ready  for  the  entry  of  student  data  for 
that  session. 

The  school  worksheet  program  is  the  only  one  of  the  five  to  operate  in 
entry-only  mode.  The  verification  of  the  worksheet  information  was  not  as 
critical  as  the  possibility  that  subsequent  processing  of  the  tracking 
record  might  contaminate  the  control  field  information.  For  this  reason, 
the  operations  coordinator  was  given  the  capability  to  alter  tracking  file 
information. 


8.3.6    Student  Data  Processing 

The  student  data  entry  sub-program  is  initiated  by  selecting  option 
number  two  on  the  primary  menu.  The  first  form  requests  input  of  the 
identification  code  of  the  session  to  be  processed.  The  program  issues  a 
read  to  the  tracking  file  for  the  record  corresponding  to  that  session.  If 
the  record  is  not  present,  either  the  school  worksheet  information  has  not 
been  entered  for  the  session,  or  the  operator  has  incorrectly  entered  the 
session  code.  A  warning  message  prompts  the  operator  to  correct  the  code, 
enter  another  session  code,  or  return  to  the  primary  menu. 




If  the  tracking  record  is  found,  the  program  reads  the  control  fields 
to  determine  the  last  activity  performed  on  that  session,  and  compares  it 
with  the  processing  mode  specified  in  the  primary  menu.  If  the  current  mode 
is  equal  to  or  greater  than  the  last  activity  code,  the  operator  is  allowed 
to  continue.  For  example,  if  the  last  activity  performed  was  verification 
and  the  current  mode  is  also  verification,  the  program  assumes  that 
previous  verification  processing  was  interrupted  and  is  to  be  resumed.  If, 
on  the  other  hand,  the  current  mode  is  entry,  the  program  insists  that 
entry  has  been  completed  with  the  initiation  of  verification.  In  this 
situation,  the  operator  may  not  process  this  batch  and  must  either  return 
to  the  primary  menu  to  change  modes,  or  select  another  session  to  process 
under  the  current  mode. 
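
The comparison of the current mode with the last activity can be illustrated as follows. This is a minimal sketch in Python; the numeric values assigned to the modes, and the value 0 for a session with no activity yet, are assumptions, since the report does not list the actual codes.

```python
from enum import IntEnum


class Mode(IntEnum):
    # Hypothetical codes; only their ordering matters for the check.
    NONE = 0
    ENTRY = 1
    VERIFICATION = 2
    RESOLUTION = 3


def may_process(last_activity: Mode, current: Mode) -> bool:
    """Allow processing when the current mode is equal to or later than
    the last activity recorded on the tracking record."""
    return current >= last_activity
```

With these codes, resuming an interrupted verification is allowed, while selecting entry for a session that has already reached verification is refused.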

At  this  point,  if  the  current  mode  is  either  verification  or  entry,  the 
program  reads  the  vector  of  booklet  counts  from  the  appropriate  control 
area  in  the  tracking  record.  These  counts  will  be  updated  by  subsequent 
processing  and  rewritten  to  the  tracking  record  at  the  completion  of 
processing.  The  batch  data  and  audit  files  are  then  opened  for  input  and 
output  processing.  The  booklet  cover  form  is  displayed,  requesting  input  of 
the  student  ID  code  for  the  booklet  to  be  processed. 

Upon  entry  of  the  six-digit  code,  the  program  issues  a  read  to  the  data 
file  for  the  data  record  corresponding  to  that  booklet.  An  error  message  is 
issued  if  either:  the  data  record  is  found  and  the  current  mode  is  entry; 
or  the  record  is  not  found  and  the  mode  is  verification  or  resolution.  In 
either  case,  the  operator  may  correct  the  booklet  code,  enter  a  new  booklet 
code,  or  return  to  the  session  entry  form. 
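
The two error conditions can be expressed compactly. The following Python sketch is illustrative only; the function name, the mode labels, and the message texts are hypothetical.

```python
from typing import Optional


def lookup_error(record_found: bool, mode: str) -> Optional[str]:
    """Return an error message for an inconsistent booklet lookup, else None."""
    if mode == "entry" and record_found:
        # A record already exists, so the booklet cannot be entered again.
        return "booklet has already been entered"
    if mode in ("verification", "resolution") and not record_found:
        # There is nothing to verify or resolve for this booklet.
        return "booklet has not been entered"
    return None
```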

To  ensure  that  the  correct  booklet  number  has  been  entered,  the 
operator  is  prompted  to  enter  the  single-letter  block  codes  printed  on  the 
booklet  cover.  The  program  will  not  proceed  unless  the  correct  block  codes 
have  been  entered,  since  these  codes  correspond  to  the  formats  to  be  used 
in  processing  that  booklet's  data  record.  By  definition,  all  of  the  booklet 
numbers  in  a  tape  session  must  correspond  to  the  session  code,  therefore  no 
block  validation  is  performed  for  tape  booklets. 

If  no  record  is  found  under  the  entry  mode,  the  program  sets  up  to 
create  a  new  record.  The  operator  is  prompted  to  enter  the  remaining  fields 
from  the  booklet  cover:  administration  code,  grade,  exercise  administrator 
code,  sex,  race,  birth  date,  and  school  code. 

If  a  record  is  found  under  the  verification  mode,  the  program  sets  up 
to accept input as if it were in entry mode. However, as each field value
is  entered,  it  is  compared  against  its  corresponding  location  on  the  data 
record.  If  the  values  agree,  processing  continues  with  the  next  field.  If 
they  disagree,  a  warning  message  is  issued  and  the  program  "locks"  on  that 
field,  giving  the  operator  an  opportunity  to  determine  and  enter  the 
"correct"  value.  A  more  complete  explanation  of  the  verification  process  is 
given  below. 

If  a  record  is  found  under  the  resolution  mode,  the  program  displays 
the front cover data values from the record in their corresponding fields
in the form. The operator may then use the TAB and BACKSPACE keys to move
from  field  to  field  and  overwrite  any  field  value. 

The  operator  presses  the  ENTER  key  to  terminate  processing  of  the  front 
cover  form.  If  there  are  any  blank  or  partially  blank  fields  on  the  form, 
the  program  signals  that  entry  is  not  complete  and  the  operator  must  fill 
those  fields  with  either  valid  data  values  or  the  missing  data  code.  If  the 
form  is  complete,  the  program  prepares  to  process  the  sections  within  the 
booklet.  The  program  uses  a  control  table  organized  by  booklet  number  and 
section  number  to  determine  which  blocks  correspond  to  each  section  in  each 
booklet. For each section, control is passed to the FORM_ENTRY subroutine
to  complete  processing  of  the  data  record. 

8.3.7    Forms  Processing 

The FORM_ENTRY routine is the workhorse of the data entry system, and
serves  as  the  model  for  all  other  full-screen  entry  functions.  It  receives 
from  the  calling  program  the  name  of  the  block  to  be  processed  and  a  work 
area.  In  the  entry  mode,  this  work  area  is  received  as  a  string  of  blank 
characters  and  returned  to  the  calling  program  as  a  contiguous  string  of 
entered  data  values.  In  the  verification  and  resolution  modes,  it  contains 
the  data  string  from  the  input  data  record  and  returns  the  modified  data  to 
be  written  back  to  the  data  record.  The  routine  also  returns  to  the  calling 
program  the  length  of  the  data  string. 

The  block  name  received  by  the  routine  is  a  two-character  mnemonic  code 
assigned  to  the  cognitive  and  background  item  blocks.  It  also  identifies 
which  form  to  use  to  process  the  response  data  for  that  block.  For  the 
student  data,  all  blocks  contained  few  enough  items  to  be  represented  in  a 
single  form  without  a  cluttered  or  crowded  appearance. 

The  items  within  the  block  are  arranged  in  column  order  on  the  form, 
using  three  or  four  columns  of  approximately  equal  length.  Each  item  is 
labeled  in  the  text  area  by  its  sequence  number  within  the  block,  followed 
by as many data fields as there are possible responses to that item. Each
data field is named according to the NAEP number printed beside its
corresponding item. This field name does not appear in the displayed form,
but is used as an internal identification code by the forms management
system.  The  data  fields  were  "flagged"  by  an  underline  attribute  to 
distinguish  the  data  entry  and  display  areas  from  the  text  part  of  the 
form. 

An  application  program  accesses  a  field  within  a  displayed  form  only  by 
using  the  field  name.  The  application  must  therefore  "know"  the  field  names 
within  a  form,  how  they  are  to  be  processed,  and  in  what  order  they  are  to 
be  processed.  This  information  is  provided  to  the  entry  system  by  the  forms 
parameter file.

The  forms  parameter  file  contains  one  record  for  each  field  for  each 
form.  The  file  is  structured  as  an  indexed  file,  using  the  form  name  and 
sequence number of the field within the block as the access key. After
loading in the designated form from the form library, the routine locates
and  reads  the  record  corresponding  to  the  first  field  from  the  parameter 
file  using  an  indexed  read.  The  remaining  records  are  read  sequentially 
from  that  point  until  a  record  for  another  form  or  end-of-file  is 
encountered.  The  parameters  on  these  records  are  loaded  into  an  internal 
table  which  is  used  by  the  routine  in  processing  the  data  for  this  block. 
The  contents  of  the  parameter  table  will  be  listed  here  and  their  functions 
elaborated  below:  item  number,  field  name,  alternate  form,  alternate  field, 
field type, field width, number of valid responses, next field name,
conditional  codes  and  conditional  field  names. 
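
The read pattern described above, one keyed read followed by sequential reads until the form name changes, can be sketched in Python. The indexed file is simulated here by a list sorted on its key; the function name, the tuple layout, and the assumption that sequence numbers start at 1 are illustrative only.

```python
from bisect import bisect_left


def load_form_parameters(records, form_name):
    """Collect the parameter records for one form.

    records -- list of (form_name, sequence, params) tuples sorted by
               (form_name, sequence), standing in for the indexed file.
    """
    keys = [(form, seq) for form, seq, _ in records]
    # "Indexed read": position on the first key at or after (form_name, 0).
    start = bisect_left(keys, (form_name, 0))
    table = []
    # "Sequential reads": stop at a record for another form or at end-of-file.
    for form, _seq, params in records[start:]:
        if form != form_name:
            break
        table.append(params)
    return table
```

Because the records for a form are contiguous in key order, a single positioning step is enough and no further keyed reads are needed.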

After  the  form  and  its  parameters  have  been  loaded,  the  routine 
determines  its  processing  environment  from  the  control  parameters  in  the 
common  area.  It  displays  the  current  booklet  section  number  as  part  of  the 
form  title  and  the  processing  mode  in  the  lower  right  corner.  Entry 
processing  begins  by  setting  an  index  to  point  to  the  first  field  on  the 
form. Since all fields are processed in an identical manner, it suffices to
describe  the  processing  of  a  single  field. 

The  program  "reads"  a  field  by  invoking  an  FMS-supplied  routine  which 
uses  the  field  name  as  input  and  returns  the  contents  of  the  field  and  a 
field  terminator  code.  There  are  four  terminator  codes  recognized  by  the 
routine,  three  of  which  correspond  to  function  keys  on  the  keyboard:  ENTER, 
TAB,  and  BACKSPACE.  The  fourth  terminator  code,  AUTOTAB,  indicates  that  the 
field  has  been  completely  filled  by  operator  input. 

The ENTER code indicates that the operator has pressed the RETURN or
ENTER key to terminate processing for the form. The program scans the form
for blank data fields to ensure that all fields have been processed under
the entry and verification modes. If a blank is found, the program issues a
warning  message  and  the  operator  must  complete  the  form  to  proceed  with  the 
next  section. 

The  TAB  and  BACKSPACE  codes  indicate  that  the  operator  has  pressed 
their  corresponding  keys  to  move  ahead  one  field  or  back  one  field, 
respectively.  The  field  pointer  index  is  either  incremented  or  decremented 
and  the  next  or  previous  field  is  processed. 

If  the  AUTOTAB  code  is  returned,  the  entry  operator  has  made  one  or 
more  keystrokes  to  fill  the  requested  field.  The  field  width  parameter 
corresponds  to  the  size  of  the  field  in  the  form  and  indicates  the  number 
of  characters  returned  for  processing.  The  field  type  parameter  indicates 
how the returned data is to be processed. The data fields on all forms fall
into  one  of  four  types: 

Type 1 - All of the multiple-choice, single-response items. The
responses are coded by letters rather than numbers. All
numeric input data values must be translated into their
corresponding letter codes before being output to the
form and data record. These fields are also subject to
range validation.

Type 2 - All numeric data which may be checked for value range.
This includes the "circle all that apply" items and
numeric codes for some of the open-ended-response items.

Type 3 - Any numeric data which can only be validated for numeric
type but not for range. This data includes counts and
percentages.

Type 4 - All open-ended-response items which cannot be codified
and must be represented in their raw form. These fields
are always eight columns in length and may contain any
combination of alphabetic, numeric, or blank characters.

The  returned  data  value  is  first  compared  against  three  values 
designated by the three non-numeric codes on the keypad. The hyphen is used
to indicate "no response" to the item. This code is valid for all field
types. The period indicates that two or more choices were selected where
only one choice was permitted. This code is only valid for the first two
item types. The comma, translated into a question mark by the program,
indicates  a  response  which  cannot  be  resolved  by  the  entry  operator  and 
requires  coordinator  intervention.  This  code  is  valid  for  all  field  types. 

If  the  data  value  does  not  meet  the  above  criteria,  it  is  then 
processed  for  validation.  If  it  is  one  of  the  first  two  types,  it  is 
validated  for  range  according  to  the  number  of  valid  responses  parameter. 
If it is a Type 3 field, it is checked only for being numeric. A Type 4 field has
all blank characters translated into underline characters, because the
end-of-form processing does not allow blanks in the returned data string.
If the field contains an invalid data value, the program issues a warning
message at the bottom of the screen and "locks" onto this field. The
operator must enter a valid data value before the program will continue
processing  another  field.  Even  the  use  of  the  function  keys  is  prevented 
until  a  valid  value  is  entered. 
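
The special-code check and the validation by field type can be sketched as one function. This Python example is an illustration under several assumptions that the report does not state: Type 1 values run from 1 to the number of valid responses and map to the letters A, B, C and so on; Type 2 values run from 0 to the number of valid responses; and Type 4 input is limited to letters, digits, and blanks. The function name and the use of an exception to represent the "lock" are also hypothetical.

```python
NO_RESPONSE = "-"      # hyphen key
MULTIPLE = "."         # period key, valid for Types 1 and 2 only
UNRESOLVED_KEY = ","   # comma key as typed by the operator
UNRESOLVED = "?"       # what the program stores for the comma


def process_field(value: str, field_type: int, n_valid: int = 0) -> str:
    """Return the value to store, or raise ValueError so the caller can
    lock onto the field until a valid value is entered."""
    code = value.strip()
    if code == NO_RESPONSE:
        return NO_RESPONSE
    if code == UNRESOLVED_KEY:
        return UNRESOLVED
    if code == MULTIPLE and field_type in (1, 2):
        return MULTIPLE
    if field_type in (1, 2, 3):
        if not (code.isascii() and code.isdigit()):
            raise ValueError("numeric value required")
        number = int(code)
        if field_type == 1:
            if not 1 <= number <= n_valid:          # assumed range 1..n
                raise ValueError("value out of range")
            return chr(ord("A") + number - 1)       # 1 -> A, 2 -> B, ...
        if field_type == 2 and not 0 <= number <= n_valid:  # assumed 0..n
            raise ValueError("value out of range")
        return code
    if field_type == 4:
        if not all(ch.isalnum() or ch == " " for ch in value):
            raise ValueError("letters, digits, and blanks only")
        return value.replace(" ", "_")  # blanks are not allowed at end of form
    raise ValueError("unknown field type")
```

In this sketch a period entered in a Type 3 field fails the numeric check, which matches the statement that the multiple-response code is valid only for the first two types.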

If the entry mode is indicated, the program writes the data value into
the  next  available  location  in  the  work  area  and  sets  up  to  process  the 
next  field.  If  it  is  operating  under  verification,  the  program  compares  the 
entered  data  value  with  its  corresponding  location  in  the  work  area.  If  the 
values  agree,  processing  resumes  with  the  next  field.  If  they  disagree,  the 
program issues a warning to the operator and again locks onto the field.
The program will not release control of the field until the operator
presses the ENTER key, indicating that the "correct" value has been
entered. The program then writes the new value into the work area and
continues  with  field  processing.  If  in  the  resolution  mode,  the  program 
over-writes  the  work  area  with  the  input  value. 

The "next-field" parameter contains the name of the next field to be
processed after the current field. If the field has any non-blank
conditional codes, the program first compares the entered data value with
these codes. If there is a match, the corresponding conditional field
parameter is used instead of the next-field parameter. The field pointer
index is incremented and the corresponding field name is compared with the
next field name. If they do not match, the field on the form and its
corresponding work area location are filled with the "no response" code and
the index incremented again until a match is found. The last field on a
form is signaled to the program by a next-field parameter code of "LAST",
at which point a message is issued to the operator indicating the end of
the  form. 
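
The skip logic driven by the next-field and conditional parameters can be sketched as follows. This is a minimal Python illustration; the dictionary layout of the parameter table, the field names, and the function name are assumptions, and the hyphen is used as the "no response" code as described earlier.

```python
def advance(fields, index, entered, responses):
    """Return the index of the next field to process, or None at end of form.

    fields    -- parameter table: list of dicts with keys "name", "next",
                 and "cond" (a mapping of conditional code -> field name)
    index     -- position of the field just processed
    entered   -- data value just entered for that field
    responses -- mapping of field name -> stored value; skipped fields are
                 filled in here with the "no response" code
    """
    current = fields[index]
    # A matching conditional code overrides the ordinary next-field name.
    target = current["cond"].get(entered, current["next"])
    if target == "LAST":
        return None  # last field on the form
    index += 1
    while index < len(fields) and fields[index]["name"] != target:
        responses[fields[index]["name"]] = "-"  # skipped: no response
        index += 1
    if index == len(fields):
        raise ValueError("target field not found: " + target)
    return index
```

For example, if a response of "B" to the first item directs the respondent past the next two items, those two fields are filled with hyphens and processing resumes at the target field.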

The  alternate  panel  and  alternate-field  parameters  are  used  for  the  few 
open-ended  items  with  codeable  responses.  An  alternate  form  was  generated 
for  each  of  these  items  containing  a  listing  of  the  possible  responses  and 
their  corresponding  codes.  The  alternate-field  parameter  indicates  the 
field  name  on  the  alternate  form  to  receive  the  data  value.  On  completion 
of  processing  for  this  field,  the  original  form  is  re-displayed  with  the 
new  data  value  in  its  field. 


8.3.8    Audit Trail Processing

The audit file for each session contains information on the processing
history  of  selected  data  fields  within  the  session  data  file.  The  entry 
programs  and  routines  write  a  record  to  the  audit  file  when  the  following 
field  processing  conditions  have  occurred: 

(1)  Under  entry  mode,  the  multiple  response  code  was  entered. 

(2)  Under  any  mode,  the  unresolvable  code  was  entered. 

(3)  Under  verification  and  resolution  modes,  a  data  value  was 
written  to  the  work  area  which  differed  from  the  original 
value. 

Each  record  contains  an  identification  section,  consisting  of  the 
school  and  session  codes,  the  booklet  serial  number,  section  number,  and 
item  code;  a  processing  section  consisting  of  the  operational  mode,  the  old 
data  value,  if  applicable,  and  the  new  data  value;  and  an  environment 
section  including  the  date,  time,  and  operator  code. 
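
The three audit conditions and the record layout can be sketched together. This Python example is illustrative only: the class and function names, the mode labels, and the sample identifiers are hypothetical, and the period and question mark stand for the multiple-response and unresolvable codes described in the forms processing section.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

MULTIPLE, UNRESOLVED = ".", "?"


@dataclass
class AuditRecord:
    # Identification section
    session: str
    booklet: str
    section: int
    item: str
    # Processing section
    mode: str
    old: Optional[str]
    new: str
    # Environment section
    stamp: datetime
    operator: str


def needs_audit(mode: str, old: Optional[str], new: str) -> bool:
    if new == UNRESOLVED:        # condition (2): any mode
        return True
    if mode == "entry":          # condition (1): entry mode only
        return new == MULTIPLE
    return old != new            # condition (3): verification or resolution


def audit(log: List[AuditRecord], mode, old, new, *, session, booklet,
          section, item, operator, stamp=None) -> bool:
    """Append a record to the audit log when one of the conditions holds."""
    if not needs_audit(mode, old, new):
        return False
    log.append(AuditRecord(session, booklet, section, item, mode, old, new,
                           stamp or datetime.now(), operator))
    return True
```

Appending to a plain list mirrors the sequential organization of the audit file, in which records are only ever added at the end.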

The  session  entry  processing  terminates  with  the  production  of  an  audit 
trail  report.  The  program  produces  a  formatted  listing  of  the  audit  file 
contents  at  a  printer  located  near  the  entry  terminals.  This  permitted  the 
operator  to  enclose  the  report  with  the  session  materials  for  later 
processing. 


8.3.9    Questionnaire  Processing 

The  Excluded  Student,  Teacher,  and  School  Questionnaire  data  entry 
functions  are  performed  by  three  separate  sub-programs,  each  invoked  by 
different  option  codes  on  the  primary  menu  form.  Since  the  processing  of 
each  instrument  has  more  similarities  than  differences,  the  entry 
procedures  for  all  three  will  be  described  in  this  section  and  differences 
will  be  noted  where  appropriate. 




The data for each instrument are maintained on single, indexed files
using  the  booklet  identification  code  as  the  access  key.  The  audit  trail 
for  each  data  file  is  also  maintained  on  a  single  data  file,  which 
constrains  entry  operation  to  one  operator  at  a  time  for  each  instrument. 
The  booklet  cover  entry  form  is  first  displayed,  requesting  entry  of  the 
booklet number. The program issues a read to the data file using the
booklet  number  as  access  key.  An  error  message  is  issued  if  either  the  data 
record  is  found  under  the  entry  mode  or  the  record  is  not  found  under 
verification  or  resolution  modes.  In  either  case,  the  operator  must  check 
and  enter  the  correct  identification  or  return  to  the  primary  menu  to 
change  modes. 

Front  cover  processing  continues  with  the  entry  of  the  remaining 
information. On the Excluded Student Questionnaire, the grade, sex, ethnic
code,  birth  date,  and  PSU  and  school  code  must  be  entered.  On  the  Teacher 
Questionnaire,  only  the  PSU  and  school  code  and  the  teacher  identification 
code  are  input.  On  the  School  Questionnaire,  the  PSU  and  school  code  serves 
as  the  booklet  identification  code  so  no  other  data  fields  are  required. 
The  operator  presses  the  ENTER  key  to  terminate  front  cover  processing.  The 
program invokes the FORM_ENTRY subroutine to process the response data as
if  the  questionnaires  were  composed  of  separate  sections.  The  Excluded 
Student  Questionnaire  has  few  enough  data  fields  to  be  contained  on  one 
form,  but  the  Teacher  and  School  Questionnaires  had  to  be  broken  across 
three  and  two  forms,  respectively. 

Audit  trail  reporting  is  not  activated  at  the  conclusion  of  processing 
for  each  instrument.  This  function  was  provided  to  the  operations 
coordinator to be performed on a periodic basis. The audit report program
for  each  instrument  would  first  sort  the  audit  file  by  booklet  and  item 
number  to  facilitate  the  location  of  specific  booklet  numbers  in  the 
voluminous  printed  output. 

8.3.10    Management  Functions 

The  management  and  processing  control  of  the  large  and  complex  student 
database was possible through the establishment and maintenance of the
tracking  file.  Each  record  on  this  file  contained  the  administration 
information  for  a  single  session,  including  absentee,  excluded  student,  and 
assessed  student  counts.  It  also  contained  the  processing  history  of  that 
session's data, including the time, date, and number of booklets processed
at  the  entry,  verification,  and  resolution  stages. 

Using  the  Common  Data  Dictionary  (CDD)  product  on  the  VAX,  a  domain  was 
defined  and  stored  for  the  tracking  file,  along  with  a  corresponding  record 
format,  giving  a  label  to  each  data  field  on  the  tracking  record.  Several 
procedures  were  developed  using  DATATRIEVE  and  stored  in  the  CDD  which 
accessed  the  tracking  file  and  produced  ad  hoc  processing  status  reports. 
Both  procedures  were  provided  to  the  operations  coordinator  for  producing 
these  reports. 




The  COUNTS  procedure  produced  a  summary  of  the  various  counter  fields: 
number  of  students  assessed,  number  of  booklets  entered,  number  of  booklets 
verified,  and  number  of  booklets  resolved.  The  ACTIVITY  procedure  produced 
a  more  detailed  accounting  of  the  counts  at  the  session  level,  producing 
subtotals  at  the  school  and  PSU  level.  The  processing  dates  were  also 
included  to  assist  in  the  determination  of  any  anomalies  in  the  counts. 
DATATRIEVE  was  also  used  by  the  operations  coordinator  to  make  any 
corrections  to  the  tracking  file.  A  separate  form  containing  all  of  the 
tracking  record  fields  was  developed  and  stored  in  the  forms  library.  This 
form  was  linked  to  the  file  through  the  domain  definition  under  CDD  and 
processed  via  the  DATATRIEVE  modify  command. 


8.3.11    Data  Spooling 

At  the  completion  of  entry  processing  for  the  student  database,  the 
individual  batch  data  and  audit  files  were  "spooled"  into  single,  separate 
data  files.  In  one  step,  this  consolidation  process  accomplished  three 
objectives:    performing  a  final  validation  check  on  all  data  fields  on  all 
data  records;  preparing  transfer  of  the  database  to  the  IBM  mainframe;  and 
facilitating  the  operation  of  quality  control  and  descriptive  analysis 
procedures. 

The  spooling  program  worked  from  the  tracking  file  to  ensure  the 
processing  of  all  batches.  The  resolution  flag  on  each  tracking  record  was 
checked  to  verify  that  resolution  processing  had  been  completed  for  that 
batch.  Any  unresolved  batch  was  identified  and  noted  by  the  program  and 
processing  continued  with  the  next  batch.  The  resolved  batch  data  and  audit 
files  were  opened  for  input  processing.  The  program  appended  the  session 
identification  code  to  each  input  data  record  before  writing  it  out  to  the 
spool  data  file.  The  audit  records  already  contained  session  identification 
and  were  written  to  the  spool  audit  file  as  is. 
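
The consolidation loop can be sketched in a few lines of Python. This is an illustration only: the in-memory dictionaries stand in for the tracking file and batch files, the function name is hypothetical, and placing the session code in front of each record is an assumption, since the report does not say where in the record the code was added.

```python
def spool(tracking, batches):
    """Consolidate resolved batches into one list of records.

    tracking -- mapping of session id -> {"resolved": bool}
    batches  -- mapping of session id -> list of data record strings
    Returns (spooled_records, unresolved_session_ids).
    """
    spooled, unresolved = [], []
    for session_id in sorted(tracking):       # work from the tracking file
        if not tracking[session_id]["resolved"]:
            unresolved.append(session_id)      # note it and move on
            continue
        for record in batches[session_id]:
            # Assumed layout: session code placed ahead of the record.
            spooled.append(session_id + record)
    return spooled, unresolved
```

Driving the loop from the tracking file rather than from a directory listing means that a batch with a tracking record but no completed resolution is reported instead of being silently omitted.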

The spool data file is organized as an indexed file, using the session
code  and  booklet  serial  number  as  the  access  key.  The  spool  audit  file  is  a 
sequential  file. 

An  update  program  was  made  available  to  the  operations  coordinator  for 
making corrections to the database after entry processing was complete. This
program looked and behaved like the data entry program, except that
the tracking file was not used and the data resided in one
large file.




Chapter  8.4 
EDITING  DATA 

Alfred  M.  Rogers 
Educational  Testing  Service 


The  data  editing  process  is  divided  into  three  separate  steps: 
validation,  identification,  and  correction.  Validation  ensures  that  each 
data  value  in  the  computer  file  is  of  the  correct  type,  is  within  a  range 
or set of ranges of values, and is consistent with other data values. All
invalid  data  values  are  then  identified  and  located  in  the  raw  data.  The 
erroneous  data  are  then  either  corrected  or  flagged  as  unresolvable  in  the 
computer  file. 

The  errors  uncovered  by  the  editing  process  fall  into  two  types:  those 
made by the respondent (e.g., choosing two responses for a multiple choice
exercise  requiring  only  one  response)  and  those  made  by  data  entry.  The 
validation  process  reports  both  types  of  error  with  no  knowledge  of  their 
source.  The  identification  process  determines  the  type  of  each  error.  The 
data  entry  errors  are,  for  the  most  part,  correctable;  the  correct  value 
can  be  determined  from  an  examination  of  the  raw  data.  Errors  made  by  the 
respondent,  however,  are  difficult,  if  not  impossible,  to  correct.  If  the 
intent  of  the  respondent  cannot  be  determined,  the  error  must  remain 
unresolved,  but  be  flagged  in  some  way  to  prevent  incorrect  interpretation 
in  analysis  and  reporting  procedures. 


8.4.1    Student  and  Questionnaire  Data 

The  data  entry  system  served  as  the  first  line  of  defense  against  bad 
data.  As  described  above,  all  data  values  were  validated  for  type  and  range 
as  they  were  entered  from  the  data  terminal  keyboard.  Special  codes 
assigned  for  multiple  and  indeterminate  responses  were  recorded  and 
reported  via  the  audit  trail.  The  indeterminate  values  were  later  corrected 
under  the  resolution  process. 

At  the  completion  of  data  entry  processing,  all  of  the  batch  student 
data  files  were  "spooled"  onto  a  single  master  file  in  preparation  for 
transfer  to  the  IBM  mainframe.  A  second  validation  was  performed  during 
this  spooling  process  to  catch  errors  that  had  "slipped  through"  the  entry 
system.  An  editing  program  was  developed  for  applying  corrections  to  this 
master  file,  using  the  same  methodology  as  for  the  data  entry  program.  This 
master  file  also  served  as  the  basis  for  preliminary  descriptive  data 
analyses  and  quality  control  checks. 




Although  the  questionnaire  files  did  not  need  to  be  spooled,  they 
received  the  same  secondary  validation  processing  as  the  student  data. 
Special  attention  was  given  to  the  "circle  all  that  apply"  items  to  ensure 
consistency  in  the  coding  of  responses:  if  a  respondent  circled  one  or  more 
of  the  alternatives,  those  would  be  coded  "1"  while  the  rest  would  be  coded 
"0";  if  no  alternatives  were  marked,  yet  the  respondent  had  the  opportunity 
to  reply,  all  fields  would  be  coded  "0";  if  no  alternatives  were  marked  and 
the  respondent  had  not  reached  the  item  or  was  instructed  to  skip  it,  all 
fields  would  be  coded  as  "no  response". 
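
The three coding rules above can be sketched as a small function. This is
only an illustration: the function name, its arguments, and the use of a
dash as the "no response" code are assumptions made for the example, not a
description of the program actually used.

```python
NO_RESPONSE = "-"  # assumed placeholder for the "no response" code


def code_circle_all(marked, n_alternatives, had_opportunity):
    """Return one code per alternative for a "circle all that apply" item.

    marked          -- set of 0-based positions the respondent circled
    n_alternatives  -- number of alternatives printed for the item
    had_opportunity -- False if the item was not reached or was to be skipped
    """
    if marked:
        # Circled alternatives are coded "1", all others "0".
        return ["1" if i in marked else "0" for i in range(n_alternatives)]
    if had_opportunity:
        # The respondent saw the item but circled nothing.
        return ["0"] * n_alternatives
    # The item was not reached or was skipped by instruction.
    return [NO_RESPONSE] * n_alternatives
```

Keeping "nothing applied" distinct from "not reached" is the point of the
rule: the first is a substantive answer, the second is missing data.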


8.4.2    Professionally  Scored  Items 

The  professionally  scored  items  went  through  a  separate  entry  and 
editing  process.  The  scoring  of  the  items  occurred  after  the  booklets  had 
been processed through the entry system. Since it was neither feasible nor
economically  prudent  to  send  the  booklets  back  through  the  entry  system  for 
just  a  few  data  values  for  each  booklet,  these  items  were  processed  by  key 
entry  systems. 

The  scores  were  entered  by  the  raters  into  specially  provided  boxes  on 
the  back  covers  of  the  booklets.  The  boxes  were  arranged  into  rows  for  each 
of  the  items  to  be  scored,  with  as  many  boxes  in  each  row  as  there  were 
scores  permitted  by  the  scoring  guide  for  that  item.  Rather  than  devise  a 
different  format  for  each  of  the  booklet  types  to  be  entered,  a 
general-purpose  format  was  implemented  by  allocating  a  maximum  number  of 
scores  for  each  item  and  a  maximum  number  of  items  per  booklet.  The  scores 
were  then  keyed  as  a  continuous  string  of  data  values  into  the  separate 
item  locations  in  each  record.  In  addition  to  the  scores,  each  record 
contained  the  student  ID  number  for  linkage  with  the  master  student  file, 
and  the  rater  ID  codes  from  inside  the  front  cover. 

To  ensure  that  the  student  ID  codes  were  keyed  accurately,  the  data 
file  received  from  key  entry  was  matched  against  the  master  file  by  the 
student  ID,  reporting  any  mis-matches  from  either  file.  The  mis-matched  ID 
codes  were  corrected  on  the  input  file  and  the  matching  program  run  again 
until  there  were  no  discrepancies. 
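
A two-way match of this kind can be sketched with set differences. The
function and key names below are hypothetical; the sketch shows only the
idea of reporting IDs that appear in one file but not the other.

```python
def report_mismatches(score_file_ids, master_file_ids):
    """List student IDs that appear in only one of the two files."""
    score_ids = set(score_file_ids)
    master_ids = set(master_file_ids)
    return {
        "only_in_score_file": sorted(score_ids - master_ids),
        "only_in_master_file": sorted(master_ids - score_ids),
    }
```

The correction cycle described above would rerun such a check after each
round of fixes until the discrepancies are resolved.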

The  data  files  received  from  key  entry  were  "loosely"  formatted;  the 
codes  within  the  boxes  were  transcribed  as  a  continuous  string  for  as  many 
rows  as  there  were  items  in  each  booklet.  Any  processing  scheme  must  use 
the  booklet  number  within  the  student  ID  code  to  determine  which  items  are 
in  each  booklet  and  the  location  and  number  of  data  fields  to  be  processed 
for  each  item. 
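
As a rough illustration of such a scheme, the sketch below splits a
continuous score string into per-item fields using a table keyed by booklet
number. The table contents, the item names, the field widths, and the
assumption that the booklet number occupies the first two characters of the
student ID are all invented for the example.

```python
# Hypothetical control table: booklet number -> ordered (item, field count).
CONTROL_TABLE = {
    "07": [("item_A", 3), ("item_B", 3)],
    "12": [("item_C", 3)],
}


def split_score_string(student_id, score_string):
    """Cut a continuous score string into one field group per item."""
    booklet = student_id[:2]  # assumed location of the booklet number
    fields, start = {}, 0
    for item, width in CONTROL_TABLE[booklet]:
        fields[item] = score_string[start:start + width]
        start += width
    return fields
```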

The  validation  program  checked  all  the  fields  on  each  record  for  data 
type,  range  of  values,  and  logical  consistency  with  other  fields.  The  rater 
ID  fields  were  the  first  to  be  processed.  The  values  of  the  ID  codes  were 
checked  against  a  list  of  valid  ID  codes.  The  number  of  rater  IDs  was  also 
noted  for  comparison  with  the  number  of  scores  per  item;  if,  for  any  item, 
the number of scores disagreed with the number of raters, either a score
value was missing or the rater ID code was not entered. For most of the
booklets,  only  one  rater  performed  the  scoring.  A  20  percent  sample  of 
booklets  was  selected  throughout  the  scoring  process  and  re-scored  for 
reliability  checking.  These  booklets  would  have  two  rater  IDs.  If  for  any 
item  the  first  rater  had  disagreed  with  the  second  rater,  the  item  was 
submitted  to  a  scoring  supervisor  for  resolution  scoring.  These  booklets 
would  have  three  rater  IDs. 

The  program  would  then  refer  to  a  control  table,  indexed  by  booklet 
number,  to  determine  the  number  and  types  of  item  score  fields  to  process. 
The  scored  reading  and  writing  items  fell  into  five  basic  types:  primary 
trait score only; primary and secondary trait scores; primary trait and
holistic  scores;  primary,  secondary  and  holistic  scores;  and  one  item,  "The 
Door",  which  was  subject  to  a  mechanics  scoring  process.  Twenty  percent  of 
the  primary  trait  and  holistic  scores  were  subject  to  secondary  and 
resolution  scoring;  the  secondary  trait  items  were  scored  only  once. 
Processing  the  record  continued  on  an  item-by-item  basis. 

The  primary  trait  scores,  if  applicable,  were  first  checked  for  valid 
values according to the scoring guide, then counted for comparison with the
number of rater IDs noted previously. If more than one score was
present,  the  values  of  the  first  and  second  scores  were  compared.  If  they 
disagreed,  the  program  checked  for  the  presence  of  both  a  third  score  and 
three  rater  IDs  for  the  booklet.  If  they  agreed,  only  two  rater  IDs  were 
required. 

The  secondary  trait  scores,  if  applicable,  were  then  validated 
according  to  the  scoring  guide.  The  program  referred  to  the  control  table 
mentioned above for the number of secondary trait scores to be processed.

If  the  item  was  holistically  scored,  these  scores  were  validated 
against  the  scoring  guide,  then  counted  for  rater  ID  comparison.  If  more 
than  one  score  was  present,  the  values  of  the  first  and  second  scores  were 
compared.  If  they  agreed,  only  two  scores  and  two  rater  IDs  were  required. 
If  they  disagreed  by  only  one  point,  a  third  score  was  assigned  by 
selecting  the  high  or  low  value  on  an  alternating  basis  throughout  the 
execution  of  the  program.  If  the  scores  disagreed  by  more  than  one  point, 
the  program  checked  for  the  presence  of  both  a  third  score  and  third  rater 
ID. 
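
The holistic-score rules just described can be sketched as follows. The
score range, the error messages, and the choice to begin the alternation
with the high value are assumptions for illustration; the text specifies
only that high and low were selected on an alternating basis.

```python
from itertools import cycle

VALID_HOLISTIC_SCORES = range(1, 5)  # hypothetical scoring-guide range


def make_holistic_checker(valid_scores=VALID_HOLISTIC_SCORES):
    """Build a checker that keeps the high/low alternation across records."""
    pick = cycle((max, min))  # starting with the high value is an assumption

    def check(scores, n_rater_ids):
        """Return (scores, errors) for one holistically scored item."""
        scores = list(scores)
        errors = [f"invalid score {s!r}" for s in scores if s not in valid_scores]
        if errors or len(scores) < 2:
            return scores, errors
        if n_rater_ids < 2:
            errors.append("second rater ID missing")
        first, second = scores[0], scores[1]
        gap = abs(first - second)
        if gap == 1 and len(scores) == 2:
            # One-point disagreement: the program supplies the third score.
            scores.append(next(pick)(first, second))
        elif gap > 1 and (len(scores) < 3 or n_rater_ids < 3):
            # Larger disagreement: a resolution score and rater are required.
            errors.append("third score or third rater ID missing")
        return scores, errors

    return check
```

The alternation state lives in the closure, so a single checker must be
reused for the whole run to reproduce the "throughout the execution of the
program" behavior.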

The  validation  program  produced  a  printed  list  of  all  errors  and 
inconsistencies  found  in  the  score  file.  The  booklets  identified  in  this 
list  were  collected  and  checked  against  the  listing.  In  cases  where  a  value 
had  been  mis-keyed,  the  correct  value  could  be  directly  replaced  in  the 
data  file.  If,  however,  the  error  was  on  the  booklet  itself  and  accurately 
transcribed,  the  booklet  was  sent  back  to  the  scoring  supervisor  with  an 
explanation  of  the  error. 

In  either  case,  the  data  values  on  the  file  were  corrected  through  the 
execution  of  an  update  program.  This  program  used  as  input  a  "parameter 
card"  file,  each  record  of  which  indicated  the  identification  code  of  the 
data file record to be altered, the field position within that record, and
the value to be substituted. This approach not only guaranteed accurate and
consistent  correction  of  the  data  fields,  but  by  its  printed  output 
provided  a  document  of  all  changes  made  to  the  data  file. 
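
A parameter-driven update of this kind might look like the sketch below.
The record layout (fixed-width strings keyed by an identification code),
the 0-based field position, and the wording of the log lines are
assumptions for illustration only.

```python
def apply_corrections(records, cards):
    """Apply (record_id, position, new_value) cards and log every change.

    records -- dict mapping identification code -> fixed-width record string
    cards   -- iterable of (record_id, 0-based position, replacement text)
    Returns the list of log lines; `records` is updated in place.
    """
    log = []
    for record_id, position, new_value in cards:
        if record_id not in records:
            log.append(f"{record_id}: NOT FOUND, card skipped")
            continue
        old = records[record_id]
        end = position + len(new_value)
        if end > len(old):
            log.append(f"{record_id}: position {position} out of range, card skipped")
            continue
        records[record_id] = old[:position] + new_value + old[end:]
        log.append(
            f"{record_id}: cols {position}-{end - 1} "
            f"{old[position:end]!r} -> {new_value!r}"
        )
    return log
```

Because every card produces a log line, whether or not it could be applied,
the log doubles as the record of changes that the text describes.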

The  corrected  file  was  again  processed  by  the  validation  program  to 
ensure  that  all  errors  had  been  fixed  and  that  no  new  problems  were  created 
by  these  corrections.  If  any  more  errors  were  found,  the  cycle  of 
identifying  the  booklets,  correcting  the  errors,  and  validating  the 
corrected  file  was  repeated  until  no  more  errors  were  found.  At  this  point, 
the  score  file  was  ready  for  merging  with  the  master  student  file. 


8.4.3  Conclusion 

Before  the  NAEP  data  entry  methodology  was  developed,  the  editing 
process  for  any  data  file  proceeded  in  the  same  manner  as  for  the 
professionally  scored  items.  The  validation  process  was  especially 
inefficient  because  it  was  performed  after  the  fact  of  transcription  and 
often  by  a  second  party  who  did  not  have  immediate  access  to  the  raw  data. 
Putting  the  validation  mechanism  at  the  point  of  entry  removed  most,  if  not 
all,  of  this  inefficiency  by  informing  the  entry  operator  of  a  possible 
keying  error  while  the  raw  data  value  was  accessible.  The  interactive 
resolution  process  and  audit  trail  mechanism  also  obviated  the  need  to 
generate  parameters  for  and  run  a  generalized  updating  program  as  described 
above.

The  editing  process  does  not  guarantee  that  all  errors  are  removed  from 
the  data;  only  that  the  invalid,  inconsistent,  or  otherwise  unreasonable 
values  have  been  at  least  identified,  if  not  corrected.  If  a  data  value  has 
been  mis-keyed  during  the  entry  process  and  meets  the  validation  criteria, 
this  error  could  persist  through  the  editing  process  to  the  analysis  stage 
without  detection.  The  verification  process  detects  most  of  these  errors  by 
comparing  independent  entries  of  the  same  data  and  reporting  discrepancies. 
The  likelihood  of  an  error  surviving  verification  is  thus  very  small,  but 
still  present.  A  quality  control  process  must  follow  the  entry  and  editing 
processes  to  ensure  that  the  data  values  in  a  given  record  agree  with  the 
responses  in  the  corresponding  instrument. 




Chapter  8.5 
QUALITY  CONTROL 

John J. Ferris
Educational  Testing  Service 


The  purpose  of  quality  control  was  to  assess  the  accuracy  of  the  data 
entry  operation,  or  how  closely  the  contents  of  the  various  instruments 
matched  the  resulting  datasets.    Even  though  the  data  were  carefully  keyed, 
verified,  and  edited,  the  question  remains  of  how  successfully  this  was 
done. 

Whereas  the  editing  operation  assessed  the  data  itself,  the  quality 
control  operation  assessed  the  process  of  entering  the  data.  In  editing, 
data  records  were  selected  (because  inconsistencies  were  discovered  by  an 
editing  program)  and  matched  to  the  corresponding  booklet;  in  the  quality 
control work, the reverse operation was performed: booklets were selected
and matched to the corresponding data record.

The examination of data records in the editing operation allows us to
find some of the errors in all of the records; the detailed comparison
between instrument and data record in quality control allows us to find all
of the errors in some of the records. We cannot remove all errors from
such  a  large  and  complex  database  as  we  have  in  NAEP.    If  an  error  has  been 
made  in  key  entry  which  appears  sensible  or  reasonable  in  the  data  record, 
we  cannot  know  it  is  an  error  unless  that  instrument  happens  to  have  been 
selected  for  quality  control.    That  is  why  both  editing  and  quality  control 
are  needed.    Quality  control  allows  us  to  discover  potentially  consistent 
problems  in  data  entry  which  would  never  be  discovered  by  an  editing 
program. It also allows us to discover whether a database probably
contains  sufficiently  valid  data  to  support  the  analyses  we  wish  to  pursue. 

Random  booklets  were  selected  and  the  actual  instruments  were  compared 
keystroke  for  keystroke  with  the  datasets  created  by  the  key  entry  system 
to  discover  the  discrepancies  and  measure  the  quality  of  the  data  entry 
process.    Overall,  a  very  high  quality  was  maintained  throughout;  the 
details  are  discussed  below.    The  reader  may  wish  to  refer  to  data  layouts 
or  the  instruments  themselves  in  reviewing  these  details,  especially  when 
specific  items  are  mentioned. 




8.5.1    The  Student  Data 


One  of  each  booklet  for  each  grade/age  level  was  selected  for  analysis. 
Thus,  a  total  of  67  booklets  times  three  grade/age  levels,  or  201  booklets, 
were  examined.    A  total  of  111,421  keystrokes  was  involved  in  these  201 
booklets;  only  2  keystrokes  were  in  error.    This  is  an  error  rate  of 
.000018. 

However,  since  these  results  are  affected  by  the  chance  selection  of 
booklets,  a  further  calculation  was  made.    The  probability  of  finding  two 
errors  in  a  sample  of  111,421  keystrokes  when  the  true  error  rate  is,  say, 
.0001  is 

\[ \binom{111421}{2} \times .0001^{2} \times (1 - .0001)^{111419} = .0009 \]


The  corresponding  probability  of  finding  one  such  error  is  .0002;  the 
probability  of  finding  zero  such  errors  is  .00001.    These  values  must  be 
added  to  the  .0009  for  the  probability  of  finding  two  or  fewer  errors.  In 
other  words,  we  can  be  99.89  percent  sure  that  the  true  error  rate  for  the 
student  data  booklet  key  entry  operation  was  less  than  or  equal  to  .0001. 
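
The arithmetic above can be checked numerically. The sketch below sums the
binomial probabilities of finding zero, one, or two errors and subtracts
the total from one. The function names are illustrative, and the use of
log-gamma to avoid overflow is a convenience of this sketch, not a
description of how the original figures were computed.

```python
from math import exp, lgamma, log, log1p


def binom_pmf(k, n, p):
    """Probability of exactly k errors in n keystrokes at error rate p."""
    log_pmf = (
        lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        + k * log(p) + (n - k) * log1p(-p)
    )
    return exp(log_pmf)


def confidence_rate_at_most(errors_found, keystrokes, assumed_rate):
    """Percent confidence, in the sense used above, that the true rate
    does not exceed assumed_rate given the number of errors found."""
    tail = sum(
        binom_pmf(k, keystrokes, assumed_rate)
        for k in range(errors_found + 1)
    )
    return 100.0 * (1.0 - tail)


student = confidence_rate_at_most(2, 111421, 0.0001)
print(f"{student:.2f}")
```

The same function can be applied to the questionnaire samples by
substituting their keystroke counts, error counts, and assumed rates.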


8.5.2    The  Excluded  Student  Questionnaire  Data 

Throughout  the  entire  series  of  questionnaires  in  the  NAEP  database,  a 
recurring  problem  was  the  treatment  of  multiple  responses  made  to  questions 
designed  for  a  single  response.    An  attempt  was  made  at  an  early  stage  in 
the  data  entry  to  accommodate  these  multiple  responses  by  extending  the 
response  code  list  to  include  codes  for  the  multiple  responses  encountered. 
Inevitably,  subsequent  checking  discovered  the  need  for  still  more  of  these 
additional  codes  or  an  occasional  misuse  of  a  previously  defined  one.  The 
Excluded  Student  Questionnaire  was  no  exception  in  this  regard.    A  list  of 
these  additional  codes  may  be  found  in  the  codebooks  accompanying  the  NAEP 
1983-84 Public-Use Data Tape Version 3.1 Users' Guide (Barone, Norris, &
Rogers) . 

Excluded  Student  Questionnaires  were  randomly  sampled  at  the  rate  of 
2.5  percent,  or  one  booklet  out  of  40. 


At Grade 4/Age 9,      58 booklets checked out of   2354
At Grade 8/Age 13,     56 booklets checked out of   2078
At Grade 11/Age 17,    85 booklets checked out of   3485

Total                 199 booklets checked out of   7917  (2.514%)




The following discoveries and changes resulted from this process:


Grade 4/Age 9:

Multiple  response  resolutions  were  required  for 
Question  15.    Three  new  codes  were  added  for  this 
question,  bringing  the  total  number  of  codes  to  twelve, 
namely  A-L.    A  remaining  problem  is  that  if  the  multiple 
response  is  B+C+E,  it  is  recorded  as  F,  the  code  for 
B+C. Other than this, 4 keystrokes were found to be in
error.


Grade  8/Age  13: 

Question  15  required  similar  attention  at  this 
grade/age.    In  addition,  a  number  of  questions  with 
open-ended  response  alternatives  were  keyed  without  the 
corresponding  response  code  because  the  respondent  had 
neglected  to  circle  it.    The  result  could  have  been  the 
loss  of  the  write-in  response  if  a  database  user  were 
looking  for  the  response  code  instead  of  the  write-in 
response  itself;  accordingly,  all  of  these  response 
codes  were  added  to  the  dataset.    A  total  of  209 
questions  were  affected.    In  addition,  three  keystrokes 
were  found  to  be  in  error. 


Grade 11/Age 17:

Other  than  re-coding  of  multiple  responses  as  noted 
above,  only  one  keystroke  was  found  in  error. 

A  total  of  39,800  keystrokes  was  involved  in  the  sample  of  Excluded 
Student Questionnaires examined. Disregarding the improvements in multiple
response coding and the response codes added to 209 booklets at Grade 8/Age
13, there were actually only eight keystrokes in error. Applying the same
analysis of this error rate as applied above, we can be 99.78 percent sure
that the true error rate was less than or equal to .0005. Although this
does not meet the standard set by the student data entry operation, it is
also very reassuring.


8.5.3    The  Teacher  Questionnaire  Data 

As  discussed  above,  this  questionnaire  also  exhibited  a  shortage  of 
special  codes  to  reflect  multiple  responses  which  were  far  more  common  than 
had  been  anticipated.    The  lists  of  multiple  response  codes  were  expanded 
for  a  number  of  items  and  the  additions  were  implemented  in  all  booklets; 
these  codes  are  defined  in  the  codebooks  accompanying  the  Public-Use  Data 
Tapes  Users'  Guide. 

Teacher  Questionnaires  were  randomly  sampled  at  the  rate  of  3  percent, 
or  one  booklet  out  of  33. 




At Grade 4/Age 9,      26 booklets checked out of   1030
At Grade 8/Age 13,     25 booklets checked out of    791
At Grade 11/Age 17,    30 booklets checked out of    914

Total                  81 booklets checked out of   2735  (2.962%)


The  following  discoveries  and  changes  resulted  from  this  process: 
Grade 4/Age 9:

Zeros  and  dashes  were  found  to  be  used  in  an 
inconsistent  manner;  although  not  an  error  as  such,  to 
avoid  possible  confusion  it  was  decided  to  make  all 
booklets  conform  to  a  consistent  standard:  when  a 
respondent  reached  an  item  of  the  "circle-all-that- 
apply"  type,  a  zero  was  used  to  mean  an  alternative  did 
not  apply;  when  such  an  item  was  not  reached  or  should 
not  have  been  answered,  a  dash  was  used  to  indicate 
missing  data.    One  booklet  had  a  write-in  response  which 
was  re-interpreted.    Other  than  these  data  adjustments, 
only  three  keystrokes  were  found  to  be  in  error. 

Grade  8/Age  13: 

The zero/dash confusion was found in some booklets
and changed. As in the Excluded Student Questionnaire
for Grade 8/Age 13, a number of questions with
open-ended response alternatives were keyed without the
corresponding response code because the respondent had
neglected to circle it; these response codes were added.
Three items, 21, 22, and 23, were lacking a response
flag position in the booklet though one had been
provided in the data layout; since the codes had
therefore not been keyed, they were added by program.
Seventeen erroneous keystrokes were found.

Grade  11/Age  17: 

The  zero/dash  confusion  was  found  in  some  booklets 
and  changed.    Four  keystrokes  were  found  to  be  in  error. 

A total of 41,398 keystrokes was involved in this sample of Teacher
Questionnaires. Twenty-four of these keystrokes were in error. The
application  of  the  above-described  error  analysis  allows  us  to  say  that  we 
are  99.76  percent  sure  that  the  error  rate  is  less  than  or  equal  to  .0010. 
This  rate  is  twice  as  high  as  that  found  for  the  Excluded  Student 
Questionnaire  and  ten  times  as  high  as  that  found  for  the  student  data. 




Although  this  error  rate  is  perhaps  not  alarmingly  high  (it  suggests  99.9 
percent  "pure"  data),  it  does  reflect  a  characteristic  of  the  Teacher 
Questionnaires  that  was  observed  during  quality  control  and  editing 
operations:    this  instrument  seemed  to  be  unexpectedly  difficult  for 
teachers.    Again  and  again  strange  answers,  inconsistent  answers,  missing 
answers and mis-answered questions were found throughout the data. Perhaps
this  explains  the  relative  difficulty  of  keying  this  data  correctly. 

8.5.4    The School Characteristics and Policy Questionnaire Data

This questionnaire suffered somewhat for its design and the quality of
the responses. Two items, write-ins dealing with reading programs, could
not be dealt with meaningfully. A number of questions asking for percents
were answered unpredictably: N's may have been used instead of percents;
percents  or  proportions  may  have  been  used  instead  of  N's;  percents  were 
indicated  but  did  not  add  up  to  100;  proportions  were  confused  with 
percents.    Some  write-in  responses  were  too  long  to  be  accommodated  in  the 
fields  provided  in  the  database;  such  responses  can  only  serve  a  flagging 
purpose.    Also,  many  of  the  same  sorts  of  problems  were  encountered  here  as 
were  encountered  with  the  Teacher  Questionnaires. 

School  Characteristics  and  Policy  Questionnaires  were  randomly  sampled 
at  the  rate  of  5  percent,  or  one  booklet  out  of  20. 

At Grade 4/Age 9,      30 booklets checked out of    623
At Grade 8/Age 13,     27 booklets checked out of    459
At Grade 11/Age 17,    15 booklets checked out of    301

Total                  72 booklets checked out of   1383  (5.206%)

The following discoveries and changes resulted from this process:
Grade  4/Age  9: 

The  zero/dash  confusion  described  above  was 
encountered  with  some  frequency.     Some  additional 
multiple  response  codes  were  added.    Fifteen  keystrokes 
were  judged  to  be  in  error. 

Grade  8/Age  13: 

The  zero/dash  confusion  described  above  was 
encountered with some frequency. Some additional
multiple  response  codes  were  added.    Ten  keystrokes  were 
judged  to  be  in  error. 




Grade 11/Age 17:

The  zero/dash  confusion  described  above  was 
encountered  with  some  frequency.    Some  additional 
multiple  response  codes  were  added.    Only  one  keystroke 
was  in  error. 


A  total  of  31,536  keystrokes  was  involved  in  this  sample  of  School 
Characteristics  and  Policy  Questionnaires.    With  26  keystroke  errors,  we 
can  be  99.78  percent  sure  that  the  true  error  rate  is  less  than  or  equal  to 
.0014.    Some  of  the  factors  contributing  to  this  error  rate  have  been  noted 
above.    Again,  the  complexity  of  the  instrument  and  the  occasionally 
careless  manner  in  which  some  of  the  questions  were  answered  certainly 
added  to  the  difficulty  of  the  keying  operation.    While  this  error  rate 
does not meet the extremely high standard set by the data entry for the
student  data,  it  does  indicate  a  level  of  excellence  seldom  encountered  in 
a  database  of  this  size. 


8.5.5    Summary  of  Error  Analysis 

The  quality  control  of  the  NAEP  data  for  Year  15  revealed  very  high 
standards  of  data  entry  for  all  instruments.    In  the  interests  of  making 
the  data  easier  to  interpret  and  preserving  more  of  the  complexity  of  the 
data,  some  changes  were  made  which  were  considered  improvements  rather  than 
correction  of  errors.  The  errors  that  were  discovered  led  to  the  following 
assessments  of  likely  error  rates. 


                                        Error     Confidence True Rate
Instrument                              Rate      is < or = Error Rate

Student Data                            .0001     99.89%
Excluded Student Questionnaire          .0005     99.78%
Teacher Questionnaire                   .0010     99.76%
School Characteristics Questionnaire    .0014     99.78%




Chapter 8.6
DATABASE  CREATION 

Alfred  M.  Rogers 
Educational  Testing  Service 


The data transcription and editing procedures described in Chapter 8.1
resulted in the generation of disk and tape files containing various
assessment information. Before any analysis could begin, these files had to
be pulled together into a comprehensive, integrated database. Sampling
weights were also required in order to make any valid statistical
inferences about the population from which the assessment sample was drawn.

This chapter describes the processes of extraction of sample
information for the derivation of sampling weights, and the merging, or
bringing together, of the many transcription files into the NAEP database.


8.6.1    Extraction

For  each  grade/age  cohort,  four  sets  of  weights  were  required  to 
perform  inferential  analyses:  school  weights,  excluded  student  weights, 
student weights, and teacher weights. Due to the method by which teachers
were selected, sampling weights could not be assigned to teachers, but were
instead assigned to students who were linked to participating teachers.
(See Chapter 7 for more details.)

All of the sample information was extracted from the data files,
edited, and transferred to tape files for shipment to Westat, where the
weight computation was performed. The editing process included both the
validation of the data values and frequency distribution analyses to
be compared with tracking information from the data entry system.

The school sample information was available to Westat from the
beginning of the assessment. They did not require any additional
information from ETS to compute school sample weights.

The  excluded  student  sample  information  was  extracted  from  the  Excluded 
Student Questionnaire data file. This information included: booklet serial
number, PSU and school code, grade, sex, birth date, race/ethnicity, and a
code indicating reason for exclusion. All data fields were taken from the
front cover information of each booklet, except for this exclusion code,
which was derived from the response to Item 3 of the questionnaire. A
listing of the Excluded Student Questionnaires which had not been received
at  ETS  was  included  with  the  file  for  each  grade/age  cohort. 

The  student  sample  information  came  from  two  sources:  the  student 
database  and  the  absentee  file  from  the  administration  schedules.  The 
assessed  student  sample  information  included:  booklet  serial  number,  PSU 
and  school  code,  grade,  sex,  birth  date,  race/ethnicity,  and  teacher  code. 
Since the absent students were not observed and not assigned an assessment
booklet, the booklet serial number, race/ethnicity, and teacher code were
not  available  for  the  absentee  data. 

The  absentee  file  had  to  be  adjusted  for  makeup  sessions.  The  field 
administration  procedures  required  scheduling  of  makeup  sessions  if 
absentee  rates  exceeded  certain  limits.  The  students  attending  these 
makeup  sessions  were  supposed  to  be  originally  sampled  students  who  were 
absent  for  the  regular  sessions.  Failure  to  remove  the  makeup  students  from 
the  absentee  file  would  have  resulted  in  incorrect  estimates  of  the  number 
of  students  in  those  schools.  This  problem  could  have  been  particularly 
acute in the Grade 11/Age 17 sample where absentee rates were high and many
schools  required  makeup  sessions. 

The first step in the removal process was to identify the students in
the student file who attended makeup sessions in each school. Then, for
each school and session type (spiral or tape), the sex, grade, and birth
dates of the makeup students were matched with those of the absentee
students  in  the  same  school  and  session  type.  The  absentees  identified  by 
perfect  matches  were  removed  from  the  absentee  file;  the  remaining 
unmatched  makeup  students,  if  any,  were  paired  with  randomly  selected 
absentees  who  were  then  removed  from  the  file.  This  latter  procedure  was 
necessary  only  for  the  Grade  11/Age  17  sample  in  only  a  few  of  the  many 
schools  which  had  makeup  sessions. 
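
The two-pass removal just described can be sketched as follows. The record
layout (dictionaries with the keys shown) and the seeded random generator
are assumptions for illustration; the actual file formats and the method of
random selection are not specified here.

```python
import random
from collections import defaultdict


def remove_makeup_from_absentees(absentees, makeups, rng=None):
    """Drop one absentee record for every student who attended a makeup.

    Each student is a dict with the (hypothetical) keys
    school, session_type, sex, grade, and birth_date.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed only to make the sketch repeatable

    # Group the absentees still on file by school and session type.
    remaining = defaultdict(list)
    for a in absentees:
        remaining[(a["school"], a["session_type"])].append(a)

    # Pass 1: remove absentees that match a makeup student perfectly.
    unmatched = []
    for m in makeups:
        pool = remaining[(m["school"], m["session_type"])]
        traits = (m["sex"], m["grade"], m["birth_date"])
        for i, a in enumerate(pool):
            if (a["sex"], a["grade"], a["birth_date"]) == traits:
                del pool[i]
                break
        else:
            unmatched.append(m)

    # Pass 2: pair each leftover makeup student with a random absentee.
    for m in unmatched:
        pool = remaining[(m["school"], m["session_type"])]
        if pool:
            pool.pop(rng.randrange(len(pool)))

    return [a for pool in remaining.values() for a in pool]
```

Running the perfect-match pass to completion before any random pairing
ensures that a random removal never consumes an absentee who would have
matched another makeup student exactly.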

The  teacher  sample  information  was  extracted  from  the  teacher 
questionnaire  data  file.  It  consisted  of  only  the  PSU,  school,  and  teacher 
codes  from  the  questionnaire  booklet  covers.  Westat  used  this  information 
in  conjunction  with  the  student  sample  information  to  produce  a  file  of 
student-based  teacher  weights. 


8.6.2    File  Merging 

The  transcription  process  resulted  in  the  generation  of  five  data  files 
for  each  grade/age  cohort:  one  file  for  each  of  the  three  questionnaire 
instruments,  the  student  response  data  file  from  the  data  entry  system,  and 
the  student  reading  and  writing  scores  from  professional  scoring  and  key 
entry.  The  sample  weight  derivation  process  produced  an  additional  four 
files  of  sampling  weights.  To  perform  data  analysis,  these  files  had  to  be 
integrated  into  a  coherent  and  comprehensive  database. 

This  database  would  ultimately  consist  of  four  files  per  cohort: 
school,  teacher,  excluded  student,  and  student  files.  The  student  file 
would  contain  all  five  student  samples:  the  spiral  and  four  tape  samples. 




The  school  file  could  be  linked  to  the  other  three  files  through  the  PSU 
and  school  codes.  The  teacher  file  could  be  linked  to  the  student  spiral 
sample  through  the  PSU,  school  and  teacher  codes. 

The  school  file  was  created  by  merging  the  School  Questionnaire  file 
with  the  school  weights  file.  The  PSU  and  school  code  were  used  as  the 
matching  criterion.  Each  record  of  the  resulting  file  was  formed  by 
concatenating  the  weight  information  with  the  response  data.  Since  not  all 
schools  returned  their  questionnaires,  some  of  the  output  records  contained 
only  weight  information. 
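
A merge of this shape can be sketched as below, with the weights file
driving the output so that a school without a returned questionnaire still
yields a record. The dictionary layout and field names are assumptions for
illustration; the list of unmatched questionnaires is an extra check added
in this sketch, not something the text describes.

```python
def merge_school_file(weights, questionnaires):
    """Concatenate weight and questionnaire data by (PSU, school code).

    Both arguments are dicts keyed by a (psu, school_code) tuple.
    Returns (merged records, questionnaire keys with no weight record).
    """
    merged = []
    for key in sorted(weights):
        psu, school = key
        merged.append({
            "psu": psu,
            "school": school,
            "weights": weights[key],
            # None marks a school that did not return its questionnaire.
            "responses": questionnaires.get(key),
        })
    unmatched = sorted(set(questionnaires) - set(weights))
    return merged, unmatched
```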

The  teacher  file  was  generated  from  the  Teacher  Questionnaire  file. 
Since  the  teacher  weights  were  derived  at  the  student  level,  no  information 
had  to  be  added  to  the  questionnaire  data. 

The  excluded  student  file  was  the  result  of  merging  the  Excluded 
Student  Questionnaire  file  with  the  excluded  student  weights  file.  The 
booklet  serial  number  was  used  as  the  matching  criterion. 

The  creation  of  the  student  data  file  was  a  three-stage  process, 
merging  the  professionally  scored  items,  student  weights,  and  teacher-based 
student  weights  with  the  student  response  data,  in  that  order.  In  all  three 
procedures,  the  booklet  serial  number  was  used  as  the  matching  criterion. 
The  merging  of  the  professionally  scored  item  data  was  a  more  complex 
procedure  than  the  others,  because  the  set  of  scores  for  each  item  within  a 
booklet was inserted into the response data fields in the order in which
the  items  appeared  in  the  booklet. 

The  database  was  then  ready  for  analysis.  As  new  data  values  and  scores 
were  derived,  they  were  added  to  the  relevant  files  using  the  same  matching 
procedures as described above. The public-use data tape files were
ultimately  generated  from  this  database. 


8.6.3    Master  Catalog 

A  critical  part  of  any  database  is  the  processing  control  and 
descriptive  information.    A  central  repository  of  this  information  may  be 
accessed  by  all  analysis  and  reporting  programs  to  provide  correct 
parameters  for  processing  the  data  fields  as  well  as  consistent 
identification  labeling  of  the  analysis  results.    The  master  catalog  file 
was  designed  and  constructed  to  serve  both  of  these  purposes. 

Each  record  of  the  master  catalog  contains  the  processing,  labeling, 
classification,  and  location  information  for  each  data  field  in  the 
database.    The  control  parameters  are  used  by  the  access  routines  in  the 
analysis  programs  to  define  the  manner  in  which  the  raw  data  values  are  to 
be  transformed  and  processed. 

All  data  fields  have  a  50-character  label  in  the  catalog  describing  the 
contents  of  the  field  and,  where  applicable,  the  source  of  the  field.  The 
data fields with discrete or categorical values have additional label
fields in the catalog containing the permitted values and 8- and
20-character labels for those values.


The  classification  area  of  the  catalog  record  contains  distinct  fields 
corresponding  to  pre-defined  classification  categories  for  the  data  fields. 
For  a  given  classification  field,  a  non-blank  value  indicates  the  code 
within  that  classification  category  for  the  data  field.    This  permits  the 
collection  of  identically  classified  items  or  data  fields  by  performing  a 
selection  process  on  one  or  more  classification  fields  in  the  catalog. 
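
The selection process on classification fields can be illustrated with a short Python sketch. The catalog records, category names (`subject`, `type`), and codes below are hypothetical; only the rule that a blank value means "not classified in this category" is taken from the description above.

```python
# Hypothetical catalog records; a blank code means "not classified here".
catalog = [
    {"field": "R001", "class": {"subject": "R", "type": "MC"}},
    {"field": "W001", "class": {"subject": "W", "type": "PT"}},
    {"field": "R002", "class": {"subject": "R", "type": ""}},
]


def select_fields(catalog, **criteria):
    """Names of fields whose classification codes match every criterion."""
    selected = []
    for rec in catalog:
        codes = rec["class"]
        if all(code != "" and codes.get(category, "") == code
               for category, code in criteria.items()):
            selected.append(rec["field"])
    return selected


reading_fields = select_fields(catalog, subject="R")
reading_mc = select_fields(catalog, subject="R", type="MC")
```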

According  to  the  NAEP  design,  it  is  possible  for  item  data  fields  to 
occur  in  more  than  one  age  assessment  and  more  than  one  block  within  each 
age.    The  location  fields  of  the  catalog  record  contain  the  age,  block  and, 
where  applicable,  the  item  sequence  number  within  block  of  each  occurrence 
of  the  data  field  throughout  the  Year  15  database. 

The  master  catalog  file  was  constructed  in  parallel  with  the  collection 
and  transcription  of  the  assessment  data  to  be  ready  for  use  by  analysis 
programs  when  the  database  was  created.    As  new  data  fields  were  derived 
and  added  to  the  database,  their  descriptive  and  control  information  was 
entered  into  the  catalog. 

One  of  the  most  important  uses  of  the  master  catalog  was  the  control  of 
the creation of the public-use data tape files as well as the codebooks
and  file  layouts.    A  synopsis  of  this  process  is  presented  in  the  next 
chapter. 




Chapter  8.7 
PUBLIC-USE  DATA  TAPE  CONSTRUCTION 


Alfred M. Rogers
Educational  Testing  Service 


The public-use data tapes (PUDTs) are designed to permit any research
individual or organization with an interest in the National Assessment to
perform secondary analysis on the same data as that used at ETS. This
section discusses some of the issues raised during the creation of the
data, and summarizes the procedures followed in generating the data and
related materials.

The  three  elements  of  the  distribution  package  are  the  data  tapes, 
printed  documentation,  and  microfiche  of  the  assessment  instruments.  Each 
grade/age  cohort  is  represented  on  a  separate  tape,  with  each  tape 
containing  the  data  files;  a  set  of  SPSS-X  control  statement  files  for 
generating an SPSS-X system file for each data file; a set of SAS control
statement  files  for  generating  a  SAS  system  file  for  each  data  file;  and  a 
set  of  machine-readable  catalog  files  containing  control  and  descriptive 
information  for  each  data  file,  for  the  non-SPSS-X  and  non-SAS  user.  The 
printed  documentation  consists  of  four  volumes:  a  guide  to  the  use  of  the 
data files, and a set of file layouts and codebooks for the data files
within each of the three cohorts (see NAEP 1983-84 Public-Use Data Tapes
Version 3.1 Users' Guide [Barone, Norris, & Rogers, 1986]).


8.7.1    File Definition

The  organization  and  format  of  the  data  files  to  be  produced  was  the 
first  issue  to  be  addressed.  The  ETS  database  consisted  of  four  data  files 
for  each  grade/age  cohort,  corresponding  to  the  three  questionnaire 
instruments  and  the  student  database,  incorporating  the  spiral  and  all  four 
tape  samples.  The  logical  relationship  of  the  data  files  was  a  three-level 
hierarchy,  with  the  five  student  and  the  excluded  student  samples  at  the 
bottom  level;  the  teacher  sample  at  the  next  level,  with  a  linkage  only  to 
the  spiral  sample;  and  the  school  sample  at  the  top,  with  direct  linkages 
to all samples below it. A linkage may be viewed as a one-to-many mapping
of the records within the two files linked. For example, one school record
is linked to one or more records in the teacher file, and each of these
teacher records is in turn linked to one or more records in the spiral
student file.




One organization scheme has six files corresponding to the six samples
at the bottom level, with the data from the higher order samples appended
to and repeated across as many of the lower level records as required by
the linkages. Using the previous example, each spiral sample record would
have its corresponding teacher record and school record appended. This
approach places no demand on the user to define the linkages since each
data record is complete, but it requires substantially more computer
storage space due to the larger record size.

An  alternative  scheme  would  have  these  same  six  samples  without  the 
appended  teacher  and  school  data.  The  teacher  and  school  samples  would 
reside  in  their  own  files,  with  special  data  fields  in  all  files  to 
facilitate  their  linkage  through  program  control.  At  the  expense  of  a 
little  more  sophistication  on  the  part  of  the  user,  this  approach  is  more 
economical  in  computer  resource  utilization.  This  potential  for  savings  on 
computer  storage  and  processing  costs  was  the  overriding  consideration  in 
using  this  scheme. 
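
Under the scheme adopted, the user re-creates the linkages in a program. The sketch below shows one way this might be done in Python, assuming the teacher and school files have been loaded into dictionaries keyed by their linking codes; all field names and values are hypothetical. A student record with no teacher code simply receives no teacher data.

```python
# Hypothetical files, keyed by the linking codes (PSU, school, teacher)
schools = {("11", "03"): {"sch_size": 4}}
teachers = {("11", "03", "2"): {"t_exp": 12}}
spiral_students = [
    {"psu": "11", "school": "03", "teacher": "2", "booklet": 17},
    {"psu": "11", "school": "03", "teacher": "", "booklet": 42},
]


def link_student(student, teachers, schools):
    """Return one student record with its teacher and school data appended."""
    linked = dict(student)
    school_key = (student["psu"], student["school"])
    linked.update(schools.get(school_key, {}))
    teacher_key = school_key + (student["teacher"],)
    linked.update(teachers.get(teacher_key, {}))
    return linked


linked = [link_student(s, teachers, schools) for s in spiral_students]
```

Performing the join at analysis time, rather than storing the appended data on every student record, is the storage saving the text refers to.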


8.7.2    Variable Definition

The  selection  and  arrangement  of  variables,  or  data  fields,  in  each 
file  was  the  next  order  of  business.  The  first  step  in  the  decision  process 
was  the  generation  of  a  file  of  variable  descriptors  for  each  data  file  to 
be  created.  Each  of  these  LABELS  files  contained  one  record  for  each 
variable,  each  record  containing  the  variable  name,  a  short  text 
description  of  the  variable,  and  processing  control  information  to  be  used 
by  later  steps  in  the  PUDT  process.  This  file  could  be  edited  for  deletion 
of  variables,  modification  of  control  parameters,  or  re-ordering  of  the 
variables  within  the  file. 

The  first  program  in  the  processing  stream,  GENLYT,  produced  a  printed 
layout  for  each  file  from  the  information  in  its  corresponding  LABELS  file. 
These  layouts  were  initially  reviewed  for  the  selection  and  ordering  of  the 
variables.  The  variables  which  were  excluded  from  PUDT  processing  fell 
primarily  into  two  categories:  non-applicable  and  confidential. 

The  non-applicable  variables  were  found  mostly  in  the  student  database. 
Since  the  tape  samples  were  combined  with  the  spiral  sample,  many  of  the 
variables  which  applied  to  the  spiral  students  did  not  apply  to  the  tape 
students, and vice versa. For example, the teacher code and the
student-based teacher weights were used for the analysis of spiral sample
data, but were not in the design at all for the tape sample.

The  confidential  variables  included  any  descriptor  or  code  which  could 
be  used  to  identify  individual  states,  schools,  or  students  in  the  NAEP 
sample. The PSU, school, teacher, and student identification codes used
internally  by  ETS  and  WESTAT  were  "scrambled"  according  to  specific 
algorithms  to  obtain  new  codes  for  use  in  linking  the  files  together. 

Another  confidentiality  problem  arose  in  the  response  data,  where  the 
students were asked to identify the state they had lived in four years ago.
A new variable was created using the response code and current state
residency  information  from  the  PSU  code  to  indicate  if  the  student  had 
lived  in  the  same  state,  the  same  region,  or  a  different  region. 
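
The derivation of this mobility variable can be sketched as follows. The state codes, the region lookup, and the category labels are placeholders invented for illustration; the actual NAEP region definitions and output codes are not reproduced here.

```python
# Placeholder state-to-region lookup (hypothetical codes, not NAEP's)
REGION = {"S1": "R1", "S2": "R1", "S3": "R2"}


def residence_change(current_state, state_four_years_ago):
    """Collapse two state codes into a non-identifying mobility category."""
    if state_four_years_ago == current_state:
        return "same state"
    if REGION[state_four_years_ago] == REGION[current_state]:
        return "same region"
    return "different region"
```

Because only the three-way category is written to the output file, neither state can be recovered from the released variable.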

The  ordering  of  the  variables  within  the  data  files  followed  a  general 
trend  of  decreasing  likelihood  of  usage:  identification  information 
preceded  weights,  scores,  and  other  derived  variables,  which  were  followed 
by  the  response  data.  The  identification  variables  were  generally  those  on 
the  front  covers  of  the  instruments.  The  derived  variables  included  the 
sampling  weights,  IRT  scale  values,  and  variables  derived  from  the  response 
data  or  other  sources  for  reporting  purposes.  The  response  data  variables 
were  arranged  according  to  their  order  in  the  instrument. 

The  spiral  sample  posed  an  additional  problem  because  it  entailed  the 
expression  of  63  different  booklet  formats  into  a  single,  fixed  format.  The 
solution  lay  in  arranging  the  data  "blocks"  in  order  corresponding  to  their 
letter  designations.  The  common  background  questionnaire  preceded  the  first 
spiral  block  in  the  new  record.  Each  data  record  from  the  input  student 
base  was  reformatted  according  to  its  booklet  number;  the  data  for  its 
constituent  blocks  were  moved  into  their  assigned  locations  in  the  output 
record.  The  remaining  data  block  areas  contained  blank  fields,  indicating 
that  the  data  was  missing  by  design. 

The spiral design also created a problem from the user's standpoint:
how  to  determine,  from  a  given  booklet  record,  which  data  blocks  were 
present  and  their  relative  order  in  the  instrument.  This  problem  was 
remedied  by  the  creation  of  a  set  of  control  variables,  one  for  each  block, 
which  indicated  not  only  the  presence  or  absence  of  the  block  but  its  order 
in  the  instrument.    These  control  variables  were  included  in  the  section  of 
derived  variables. 
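
The reformatting of spiral booklets and the accompanying control variables might be sketched as below. The block letters, the block width, and the use of 0 to mean "block not present" are assumptions made for the example; the text states only that each control variable indicates presence and order.

```python
BLOCKS = ["A", "B", "C", "D"]   # hypothetical block letters
BLOCK_WIDTH = 3                 # hypothetical: each block holds 3 one-column items


def reformat_booklet(background, booklet_blocks):
    """Build a fixed-format record from one booklet.

    booklet_blocks is a list of (block letter, response string) pairs in
    the order the blocks appeared in the booklet. Returns the output
    record and a dict of control variables giving each block's position
    in the booklet (0 = block not present, an assumed convention).
    """
    present = dict(booklet_blocks)
    order = {letter: 0 for letter in BLOCKS}
    for position, (letter, _) in enumerate(booklet_blocks, start=1):
        order[letter] = position
    # Blocks go to fixed locations in letter order; absent blocks stay blank
    body = "".join(present.get(letter, " " * BLOCK_WIDTH) for letter in BLOCKS)
    return background + body, order


# A booklet that presented block C first and block A second
record, control = reformat_booklet("12", [("C", "ABD"), ("A", "CCA")])
```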


8.7.3    Data  Definition 

To  enable  the  data  files  to  be  processed  on  any  computer  system  using 
any  procedural  or  programming  language,  it  was  desirable  that  the  data  be 
expressed  in  numeric  format.  With  the  exception  of  a  handful  of  open-ended 
responses,  this  was  possible,  but  not  without  the  adoption  of  certain 
conventions  for  re-expressing  the  data  values. 

As  mentioned  in  Chapter  8.3,  Data  Entry,  the  responses  to  all 
multiple-choice  items  were  transcribed  and  stored  in  the  database  using  the 
letter  codes  printed  in  the  instruments.  This  scheme  afforded  the  advantage 
of  saving  storage  space  for  items  with  ten  or  more  response  options,  but  at 
the  expense  of  translating  these  codes  into  their  numeric  equivalents  for 
analysis  purposes.  The  response  data  fields  for  most  of  these  items  would 
require a simple alphabetic-to-numeric conversion. However, the data fields
for items with ten or more response choices would require "expansion" before
the conversion, since the numeric value would require two column positions.
One  of  the  processing  control  parameters  on  the  LABELS  file  indicates 
whether  or  not  the  data  field  is  to  be  expanded  before  conversion  and 
output. 
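
A minimal sketch of the conversion, assuming that the letter codes map to consecutive integers starting at A = 1 and that expanded fields are zero-padded to two columns (neither detail is stated in the text):

```python
import string


def letter_to_numeric(code, expand=False):
    """Convert a one-column letter response (A=1, B=2, ...) to numeric text.

    With expand=True the result occupies two columns, zero-padded, so that
    options beyond the ninth fit in the field.
    """
    value = string.ascii_uppercase.index(code) + 1
    width = 2 if expand else 1
    if value >= 10 ** width:
        raise ValueError(f"{code!r} needs an expanded field")
    return str(value).zfill(width)
```

The `expand` argument plays the role of the expansion flag carried on the LABELS file.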




The ETS database contained special codes to indicate certain response
conditions: no response, "I don't know" response, multiple response, and
unresolvable response. The primary trait scores for the reading essay items
included additional special codes for ratings of "illegible" and "off task"
by the scorers. A final special code was assigned to the items which, due
to printing error, did not appear in some of the booklets at all. These
codes had to be re-expressed in numeric format.

A  convention  used  by  ECS  in  the  creation  of  their  Public-Use  Data  Tapes 
was  adopted  and  enhanced  in  the  designation  of  these  codes.  The  "I  don't 
know"  response  was  always  coded  as  7.  The  "no  response"  code  was  8.  The 
multiple  or  otherwise  indeterminate  response  received  a  code  of  9.  For  the 
primary  trait  scores,  an  "illegible"  score  was  coded  as  5  and  the  "off 
task"  score  as  6.  The  very  small  number  of  "missing"  responses  were  coded 
as  blank  fields,  corresponding  to  a  "missing  by  design"  designation. 

This coding scheme created conflicts for those items which had seven or
more valid responses as well as the "I don't know" response, and the
primary trait items with five or more scoring categories. These items also
required expansion to accommodate the valid response values. The special
codes were "extended" to fill the output data field, e.g., the "I don't
know" code was extended from 7 to 77, the "no response" code from 8 to 88,
etc.
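
The extension rule can be written compactly: the single-digit special code is repeated until it fills the field. In the sketch below the base codes are those given in the text, while the dictionary keys and function name are hypothetical.

```python
# Base codes from the convention described above; key names are hypothetical
SPECIAL = {"illegible": 5, "off_task": 6,
           "dont_know": 7, "no_response": 8, "multiple": 9}


def special_code(condition, width=1):
    """Numeric code for a special response condition, extended to fill the field."""
    return int(str(SPECIAL[condition]) * width)


one_column = special_code("dont_know")        # field of width 1
two_column = special_code("dont_know", 2)     # expanded field of width 2
```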

The  numeric  variables  on  the  tape  files  fall  into  two  categories: 
continuous  and  discrete.  The  continuous  variables  include  the  weights,  IRT 
values,  identification  codes,  and  item  responses  where  counts  or 
percentages were requested. The discrete variables include those items for
which  each  numeric  value  corresponds  to  a  response  category.  This 
designation  also  includes  those  derived  variables  to  which  numeric 
classification  categories  have  been  assigned.  The  open-ended  short  response 
items  were  to  be  transferred  with  no  conversion,  and  were  classified  as 
alpha-type variables.


8.7.4    Data File Layouts

The  data  file  layouts,  as  mentioned  above,  were  the  first  user  product 
to  be  generated  in  the  PUDT  process.  The  generation  program,  GENLYT,  used  a 
LABELS file as input and produced a printable file. This LAYOUT file is
little  more  than  a  formatted  listing  of  the  LABELS  file. 

Each  line  of  the  LAYOUT  file  contains  the  following  information  for  a 
single  data  field:  sequence  number,  field  name,  output  column  position, 
field  width,  number  of  decimal  places,  data  type,  value  range,  key  or 
correct response value, and a short description of the field. The sequence
number  of  each  field  is  implied  from  its  order  on  the  LABELS  file.  The 
field  name  is  an  8-character  label  for  the  field  which  is  to  be  used 
consistently  by  all  PUDT  materials  to  refer  to  that  field  on  that  file.  The 
output  column  position  is  the  relative  location  of  the  beginning  of  that 
field on each record for that file, using bytes or characters as the unit
of measure. The field width indicates the number of columns used in
representing  the  data  values  for  a  field.  If  the  field  contains  continuous 
numeric  data,  the  number  of  decimal  places  value  indicates  how  many  places 
to  shift  the  decimal  point  before  processing  data  values. 

The  data  type  category  uses  three  codes  to  designate  the  nature  of  the 
data  in  the  field:  alpha-numeric  data  are  coded  "A";  continuous  numeric 
data  are  coded  "C";  discrete  numeric  data  are  coded  "D".  Additionally,  the 
discrete  numeric  fields  which  include  "I  don't  know"  response  codes  are 
coded  "DI".  If  the  field  type  is  discrete  numeric,  the  value  range  is 
listed  as  the  minimum  and  maximum  permitted  values  separated  by  a  hyphen  to 
indicate  range.  If  the  field  is  a  scorable  item  response,  the  correct 
response  value,  or  key,  is  printed.  A  range  of  correct  responses  was 
indicated  for  those  professionally  scored  items  which  received  cut-point 
scoring  for  IRT  scaling.  Finally,  each  variable  was  further  identified  by  a 
50-character  descriptor. 
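
To show how the layout information might be used by someone reading the raw files, here is a minimal Python sketch that splits one fixed-width record into named values. The layout entries and the sample record are hypothetical; the treatment of blank fields as missing and the restoration of the implied decimal point follow the descriptions above.

```python
from decimal import Decimal

# Hypothetical layout entries: (name, start column, width, decimals, type)
LAYOUT = [
    ("BOOK", 1, 2, 0, "C"),
    ("WEIGHT", 3, 5, 2, "C"),
    ("ITEM01", 8, 1, 0, "D"),
    ("NOTE", 9, 3, 0, "A"),
]


def parse_record(line, layout):
    """Split one fixed-width record into named values.

    Column positions are 1-based. Blank fields are returned as None
    (missing by design). Continuous fields have the implied decimal
    point restored by shifting it `decimals` places to the left.
    """
    values = {}
    for name, start, width, decimals, dtype in layout:
        raw = line[start - 1:start - 1 + width]
        if raw.strip() == "":
            values[name] = None
        elif dtype == "A":
            values[name] = raw
        elif dtype == "C":
            values[name] = Decimal(raw.strip()).scaleb(-decimals)
        else:  # discrete numeric, "D" or "DI"
            values[name] = int(raw)
    return values


row = parse_record("17012503abc", LAYOUT)
blank_item = parse_record("1701250 abc", LAYOUT)
```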


8.7.5    Data  File  Catalogs 

The  LABELS  file  contains  sufficient  descriptive  information  for 
generating  a  brief  layout  of  the  data  file.  However,  to  generate  a  complete 
codebook  document,  substantially  more  information  about  the  data  is 
required.  This  function  is  filled,  in  part,  by  the  CATALOG  file. 

The  CATALOG  file  is  created  by  the  CATGEN  program  from  the  LABELS  file 
and  the  Year  15  master  catalog  file.  Each  record  on  the  LABELS  file 
generates a CATALOG record by first retrieving the master catalog record
corresponding  to  the  field  name.  The  master  catalog  record  contains  usage, 
classification,  and  response  code  information.  This  record  is  prefixed  by 
the  positional  information  from  the  LABELS  file:  field  sequence  number, 
output  column  position,  and  field  width. 

The  response  code  information,  also  referred  to  as  "foils",  consists  of 
the  possible  data  values  for  the  discrete  numeric  fields,  and  a 
20-character description of each one. The CATGEN program uses additional
control  information  from  the  LABELS  file  to  determine  if  extra  foils  should 
be  generated  and  saved  with  each  CATALOG  record.  The  first  control 
parameter  or  "flag"  indicates  a  primary  trait  score  field,  for  which  the 
"illegible"  and  "off  task"  codes  and  foils  should  be  generated.  The  second 
flag controls generation of the "I don't know" foil. The third flag
regulates "no response" foil generation, and the fourth flag denotes the
possibility  of  multiple  or  out-of-range  responses  for  that  field  and  sets 
up  an  appropriate  foil.  All  of  these  control  parameters,  including  the 
expansion  flag,  may  be  altered  in  the  LABELS  file  by  use  of  a  text  editor 
to  suit  the  data  behavior  for  any  given  field. 
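
The flag-driven generation of extra foils can be pictured with the following Python sketch. It is not the CATGEN program; the function name, argument names, and foil label texts are hypothetical, and the example assumes a one-column field so that the unextended special codes 5 through 9 apply.

```python
def add_special_foils(foils, primary_trait=False, dont_know=False,
                      no_response=False, multiple=False):
    """Append special-code foils to a field's list of (value, label) pairs.

    Each keyword argument stands for one of the four control flags
    described in the text. Codes are for one-column fields.
    """
    extra = []
    if primary_trait:
        extra += [(5, "ILLEGIBLE"), (6, "OFF TASK")]
    if dont_know:
        extra.append((7, "I DON'T KNOW"))
    if no_response:
        extra.append((8, "NO RESPONSE"))
    if multiple:
        extra.append((9, "MULTIPLE RESPONSE"))
    return list(foils) + extra


item_foils = add_special_foils([(1, "YES"), (2, "NO")],
                               dont_know=True, no_response=True)
```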

The  LABELS  file  supplies  control  information  for  many  of  the  subsequent 
PUDT  processing  steps.  The  CATALOG  file  provides  the  detail  information  for 
those  same  steps  and  for  others  as  well. 




8.7.6  Codebooks 


The  data  file  codebook  is  designed  as  a  printed  document  containing 
complete  descriptive  information  for  each  data  field.  Most  of  this 
information  derives  from  the  CATALOG  file;  the  remaining  data  came  from  two 
other  files:  the  COUNTS  file  and  the  IRT  parameters  file. 

Each  data  field  receives  at  least  one  line  of  descriptive  information 
in  the  codebook.  If  the  data  type  is  either  alphabetic  or  continuous 
numeric,  no  more  detail  is  given.  If  the  variable  is  discrete  numeric,  the 
codebook  lists  the  foil  codes,  foil  labels,  and  frequencies  of  each  value 
in  the  data  file.  Additionally,  if  the  field  represents  an  item  used  in  IRT 
scaling,  the  codebook  lists  the  parameters  used  by  the  scaling  program. 

The  frequency  counts  are  not  available  on  the  catalog  file,  but  must  be 
generated from the data itself. The GENFREQ program created the COUNTS file
using  the  field  name  to  locate  the  variable  in  the  database,  and  the  foil 
values  to  validate  the  range  of  data  values  for  each  field.  This  program 
also  serves  as  a  check  on  the  completeness  of  the  foils  in  the  CATALOG 
file,  as  it  flags  any  data  values  not  represented  by  a  foil  value  and 
label. 
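
The two roles of the frequency step, tabulating each permitted value and flagging values with no foil, can be illustrated in a few lines of Python. This is a sketch of the idea only and not the GENFREQ program; the function name and sample data are hypothetical.

```python
from collections import Counter


def count_values(values, foils):
    """Tabulate a discrete field and report values that have no foil.

    values is an iterable of data values for one field; foils maps each
    permitted value to its label.
    """
    counts = Counter(values)
    frequencies = {value: counts.get(value, 0) for value in foils}
    unmatched = sorted(v for v in counts if v not in foils)
    return frequencies, unmatched


foils = {1: "YES", 2: "NO", 8: "NO RESPONSE"}
frequencies, unmatched = count_values([1, 2, 2, 8, 4, 2], foils)
```

Here the value 4 has no foil and would be flagged for review, prompting either a data correction or an addition to the catalog.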

The  IRT  parameter  file  is  linked  to  the  CATALOG  file  through  the  field 
name.  Printing  of  the  IRT  parameters  is  governed  by  a  control  flag  in  the 
classification  section  of  the  CATALOG  record. 

The  LAYOUT  and  CODEBOOK  files  are  written  by  their  respective 
generation  programs  to  print-image  disk  data  files.  Draft  copies  are 
printed  and  distributed  for  review  before  the  production  copy  is  generated. 
The  production  copy  is  printed  on  an  IBM  3800  printing  sub-system  using 
laser-imaging  technology.  The  printing  is  performed  at  15  characters  per 
horizontal  inch  (pitch)  and  8  lines  per  vertical  inch.  This  accommodates 
printing  of  115  characters  per  line  and  80  lines  per  page  on  standard 
8-1/2"  X  11"  paper. 


8.7.7    SAS  and  SPSS-X  Control  Files 

The  SAS  and  SPSS-X  control  statement  files  are  provided  to  the  user  as 
a  means  for  converting  the  raw  data  files  directly  into  a  system  file  for 
subsequent analyses under either package. The files are very similar in
their content and structure, although the actual implementation of their
features differs slightly. Two separate programs, GENSAS and GENSPX, generate
the control files using the CATALOG file as input.

Each of the control files contains separate sections for variable
definition,  variable  labeling,  missing  value  declaration,  value  labeling, 
and  creation  of  scored  variables  from  the  cognitive  items.  The  variable 
definition  section  describes  the  locations  of  the  fields,  by  name,  in  the 
file,  and,  if  applicable,  the  number  of  decimal  places  or  type  of  data.  The 
variable  label  identifies  each  field  with  a  50-character  description.  The 
missing value section declares which values of which variables are to be
treated as missing and excluded from analyses. The value labels correspond
to  the  foils  in  the  CATALOG  file.  The  code  values  and  their  descriptors  are 
listed  for  each  discrete  numeric  variable.  The  scoring  section  is  provided 
to  permit  the  user  to  generate  item  score  variables  in  addition  to  the  item 
response  variables. 

Each of the code generation programs combines three steps into one
complex  procedure.  As  each  CATALOG  file  record  is  read,  it  is  broken  into 
several  component  records  according  to  the  information  to  be  used  in  each 
of  the  resultant  sections.  These  record  fragments  are  tagged  with  the  field 
sequence  number  and  a  section  sequence  code.  They  are  then  sorted  by 
section  code  and  sequence  number.  Finally,  the  reorganized  information  is 
output  in  a  structured  format  dictated  by  the  syntax  of  the  processing 
language.
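
The split, tag, sort, and output sequence just described might look like the following in Python. This is an illustrative sketch, not GENSAS or GENSPX: the catalog fields are hypothetical, and the output lines are a generic notation rather than valid SAS or SPSS-X syntax.

```python
SECTIONS = {1: "variable definition", 2: "variable labels", 3: "missing values"}


def build_control_lines(catalog):
    """Split catalog records into tagged fragments, sort, and emit by section."""
    fragments = []
    for rec in catalog:
        seq = rec["seq"]
        end = rec["start"] + rec["width"] - 1
        # Each fragment is tagged with (section code, field sequence number)
        fragments.append((1, seq, f"{rec['name']} {rec['start']}-{end}"))
        fragments.append((2, seq, f"{rec['name']} '{rec['label']}'"))
        if rec.get("missing"):
            codes = ",".join(str(c) for c in rec["missing"])
            fragments.append((3, seq, f"{rec['name']} ({codes})"))
    fragments.sort(key=lambda frag: (frag[0], frag[1]))
    lines = []
    current = None
    for section, _, text in fragments:
        if section != current:
            lines.append(f"* {SECTIONS[section]}")
            current = section
        lines.append(text)
    return lines


catalog = [
    {"seq": 2, "name": "ITEM01", "start": 8, "width": 1,
     "label": "First item", "missing": [8, 9]},
    {"seq": 1, "name": "WEIGHT", "start": 3, "width": 5,
     "label": "Student weight"},
]
lines = build_control_lines(catalog)
```

Sorting on the section code first gathers all fragments of one kind together, which is what allows a single pass over the catalog to feed several separate sections of the control file.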

The  generation  of  the  system  files  accomplishes  the  testing  of  these 
control statement files. These files are saved for use by internal ETS
users  of  the  NAEP  data. 


8.7.8    Machine-Readable  Catalog  Files 

For  those  NAEP  data  users  who  have  neither  SAS  nor  SPSS-X,  yet  require 
processing  control  information  in  a  computer-readable  format,  the 
distribution  tape  also  contains  machine-readable  catalog  (CAT)  files. 
In  addition  to  processing  control  information,  each  CAT  record  contains  the 
IRT  parameters  and  the  foil  codes  and  labels. 




IMPLEMENTING  THE  NEW  DESIGN: 
THE  NAEP  1983-84  TECHNICAL  REPORT 


PART  II 


Chapter  9 


OVERVIEW  OF  PART  II: 
THE  NAEP  1983-84  DATA  ANALYSIS 

Albert  E.  Beaton 
Educational  Testing  Service 


The purpose of this chapter is to present an overview of the procedures
used in the analysis of the NAEP Year 15 (1983-84) data. These procedures
were used for the parameter estimates which were reported in The Reading
Report Card: Progress Toward Excellence in Our Schools (1985), Writing:
Trends Across the Decade, 1974-84 (Applebee, Langer, & Mullis, 1986), and
other reports which have been prepared or are in preparation. The analytic
procedures are described in detail in the rest of Part II of this technical
report.

This second part of the technical report assumes the existence of a
carefully edited database, and thus does not cover the operations involved
in constructing the database, which are discussed in Part I. The reader
should consult Part I of this report for information about the NAEP data,
including:

• an overview of the operations involved in collecting and
editing the data (Chapter 2);

• the development of the reading and writing exercises (Chapter

• the stratified random sampling plan (Chapter 4);

• the assignment of exercises to students (Chapter 5);

• instrument and item information (Chapter 6);

• the field administration procedures (Chapter 7); and

• the flow of data from the field to an edited database and
public-use data tapes (Chapter 8).

Sections  9.1  through  9.6  below  follow  the  sequence  of  the  remaining 
chapters  in  Part  II  of  the  technical  report: 

• the reading data analysis, including the study of
dimensionality, differences between printed and tape recorded
administration, maximum likelihood estimation, marginal
estimation, conditioning, plausible values, trend analysis,
and behavioral anchoring (Chapter 10);

• the writing data analysis, including reliability, differences
between printed and tape recorded administration, trend
analysis, ARM scaling, conditioning, and plausible values
(Chapter 11);

• the background and attitude data analysis, including the
definition of reporting variables and WARM scaling procedure
(Chapter 12);

• the estimation of population parameters, including sampling
weights, estimation of sampling error, estimating measurement
error, and the NAEP tabular results (Chapter 13); and

• some supplementary studies, including the validity of the NAEP
data and the design effects for Year 15 NAEP (Chapter 14).


* * *

Before discussing the data analytic procedures, it may be useful to
make some general comments about the NAEP data analysis. The NAEP data
analyses were guided by a number of priorities: accuracy, interpretability,
public-usefulness of the data, and timeliness of reporting. There were also
a number of constraints such as keeping a student's burden under an hour,
maintaining trends, collecting data in the schools within a few months of
receiving the grant, and, of course, keeping within a very tight budget.
These priorities and constraints often conflicted and required
improvisation.

An  example  of  conflict  occurred  during  the  assembly  of  test  booklets. 
The  ETS  design  called  for  random  subsamples  of  students  to  be  administered 
booklets  of  about  25  exercises  each,  a  number  sufficient  to  permit  a 
reasonably  precise  estimate  of  reading  proficiency  from  each  sampled 
student.  Within  the  tight  transition  and  assessment  timelines,  however,  the 
target of 25 exercises per pupil could not be met, and it was not possible
to  obtain  precise  estimates  of  proficiency  for  all  students.  Because  it  is 
population-level  characteristics  rather  than  individual  student 
characteristics  that  are  of  interest  in  NAEP,  the  possibility  of  fulfilling 
NAEP's  function  remained  open— but  only  if  new  techniques  could  be 
developed  to  produce  estimates  of  population  attributes  directly,  without 
the  (now  impossible)  intermediate  step  of  computing  scores  for  everyone  in 
the  sample. 

The  major  tool  used  in  computing  consistent  estimates  of  group 
performance  was  plausible  values,  an  adaptation  by  Mislevy  (1985a)  of  a 
method  of  handling  missing  data  originally  proposed  by  Rubin  (1977,  1978). 
The idea is that although we do not know an individual's proficiency, we
can estimate a distribution of plausible scores for each individual, given
the available data, and that this distribution represents both what we do
know and what we do not know about the individual's proficiency. Using a
random selection from the distribution of plausible values of each
individual,  it  is  possible  to  make  consistent  estimates  of  selected 
population  parameters.  The  variation  of  results  from  one  set  of  random 
selections  to  another  is  an  estimator  of  the  error  due  to  imprecise 
measurement.  In  practice,  we  generate  five  plausible  reading  and  writing 
values  for  each  individual  who  was  administered  exercises  in  the  area  and 
then, to estimate measurement error variance, compute each parameter
estimate  five  times. 
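
The logic of computing each estimate five times can be sketched in Python. The sketch makes simplifying assumptions that are not part of the NAEP procedure described in later chapters: each student's distribution of plausible scores is taken to be normal with a known mean and standard deviation, the parameter of interest is a weighted mean, and all numbers are invented.

```python
import random
import statistics


def draw_plausible_values(posteriors, n_draws=5, seed=0):
    """Draw n_draws values per student from an assumed normal (mean, sd)."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, sd) for _ in range(n_draws)]
            for mean, sd in posteriors]


def estimate_with_error(plausible, weights):
    """Weighted mean computed once per set of draws, plus between-set variance."""
    n_draws = len(plausible[0])
    total = sum(weights)
    estimates = [
        sum(w * pv[m] for w, pv in zip(weights, plausible)) / total
        for m in range(n_draws)
    ]
    # Variation across the sets of draws reflects imprecise measurement
    return statistics.mean(estimates), statistics.variance(estimates), estimates


# Hypothetical (mean, sd) per student and sampling weights
posteriors = [(250.0, 20.0), (300.0, 15.0), (275.0, 25.0)]
weights = [1.0, 2.0, 1.5]
pvs = draw_plausible_values(posteriors)
point, between_var, estimates = estimate_with_error(pvs, weights)
```

If every student were measured without error (standard deviation zero), the five estimates would coincide and the between-set variance would vanish, which is the sense in which this variance reflects measurement error rather than sampling error.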

This method of estimation does not in general give consistent results
for all possible subpopulations, and will not unless the classification
variables corresponding to the subpopulations are explicitly conditioned on
in the process of creating the plausible values. We therefore conditioned
on as many variables as our technology would allow, which were the major
NAEP reporting variables (e.g., sex, race/ethnicity). Beaton, Mislevy,
Kaplan and Sheehan (1986) demonstrated the process using data available on
the SAT Public-Use Tape and showed the importance of conditioning on
subpopulation membership. Since then, the possible biases involved in using
unconditioned variables have been studied extensively. The results are
reported in the next two chapters.

The development of the concept of plausible values for large scale
surveys has had several side benefits. Already, we have been able to place
the reading data from past assessments onto the Year 15 scale for trend
analysis, where the data might otherwise have been too sparse for scaling.
Since fewer exercises are required for group estimates, limited assessment
time can be used to estimate several subscales in a learning area, thereby
reducing dependence on the assumption of unidimensionality (see Zwick,
Chapter 10.1). Perhaps most important is that the plausible values force an
analyst to consider an important problem which is hidden in much survey
research: the problem of fallible measurement.

All educational measurement, indeed all measurement, is subject to
error,  and  this  error  affects  the  assessment  of  relationships  which  are 
made  from  the  data.  This  phenomenon  affects  all  analyses  of  educational 
survey data. We have not studied whether or not other national surveys have
more  or  less  measurement  error  than  NAEP,  but  some  such  error  surely 
exists.  The  method  of  plausible  values  is  an  attempt  to  improve  estimation, 
given  the  fallibility  of  the  data. 

Let  us  consider  an  assessment  design  which  might  have  been  used  instead 
of the present NAEP design. We might have administered a short test of
reading  and  writing  to  all  students,  at  least  at  a  particular  grade/age 
combination.  The  most  obvious  losses  would  be  the  broad  coverage  of  the 
subject areas, the links to past assessments, and, unless the tests
contained substantial overlap at different grades, the linking across
grade/ages. The measurement of an individual's proficiency would still be
imprecise;  the  amount  of  error  would  depend,  among  other  things,  on  how 
many items were offered, how many items a student attempted, and the
selection of the items themselves, especially if a serious ceiling and
floor effect was present. Such a survey would offer each student the same
number of items in a subject area, whereas the NAEP design offers some
students  many  items  and  other  students  only  a  few.  Neither  the  hypothetical 
survey  design  nor  the  NAEP  design  is  assured  of  an  adequate  range  of  items 
nor  can  either  assure  that  the  students  will  respond  to  all  of  the  items 
offered. We would expect that the measurement error from subject to subject
would  vary  less  in  the  simpler  design  than  in  the  NAEP  design,  which  has 
both  quite  precise  and  imprecise  subject  measurement.  In  both  cases, 
ignoring  the  measurement  error  is  done  at  the  analyst's  peril,  since  the 
error  will  result  in  biased  results. 
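
One familiar form of this bias is the attenuation of correlations. The sketch below assumes the classical test theory model, which is introduced here only for illustration: errors are uncorrelated with true scores and with each other, so the observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The function name and the numbers are hypothetical.

```python
import math


def attenuated_correlation(true_corr, reliability_x, reliability_y):
    """Observed correlation between two fallible measures.

    Assumes classical test theory: each reliability is the share of
    observed-score variance that is true-score variance.
    """
    return true_corr * math.sqrt(reliability_x * reliability_y)


# Hypothetical case: a true correlation of .60 measured with
# reliabilities of .81 and .64 appears as a correlation of about .43.
observed = attenuated_correlation(0.60, 0.81, 0.64)
```

An analyst who treats the observed value as the true relationship understates it, which is the kind of bias the paragraph above warns about.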

The  method  of  plausible  values  can  be  used  in  either  case  to  reduce  the 
bias  due  to  error  of  measurement.  If  the  assumptions  of  the  plausible  value 
models  are  met,  the  bias  in  parameter  estimates  approaches  zero  as  the 
sample  size  approaches  infinity. 

The  measurement  error  problem,  and  several  other  data  analytic 
problems,  have  been  known  for  years  and  affect  analyses  of  any  educational 
data,  including  the  analyses  that  we  have  done  of  NAEP  data  and  the 
analyses  that  others  may  do  using  these  data  in  the  future.  The  question 
might  arise  as  to  whether  these  data — indeed  any  survey  data — should  ever 
be  used  at  all.  The  value  decisions  involved  in  using  imprecise  data  were 
well  described  twenty  years  ago  in  Equality  of  Educational  Opportunity 
(Coleman, Campbell, Hobson, McPartland, Mood, Weinfeld, & York, 1966),
which is better known as the Coleman Report. The Technical Appendix to
Section 3.2 states on page 326:


There  are  three  central  facts  to  be  remembered  throughout  any 
analysis  of  the  sort  here  conducted: 

1. The measurement of either any single variable or any
class of variables is at best partial and incomplete.

2. When two variables (or two sets of variables) are
statistically associated, for reasons that may be
either irrelevant or closely related to the study, an
apparent relationship of another variable to one of
them may result from an actual relationship of that
variable to the other. (If this occurs, we are
likely to speak of the first as a surrogate for the
second, and to try to uncover the effect by studying
the joint relationship of our response with both
variables or sets of variables.)

3. Even if association of the variables we are studying
with some "explanatory" variable is firmly
established, this establishment cannot of itself
settle the question of causation (though strong
evidence would be provided if the time order were
known); either variable may "cause" the other, or
both may share a common cause. In many cases,
continuing studies of the development of these
variables over time can untangle such a question of
"What causes what?" In the present case, studies of
change in achievement level could give more direct
evidence than the present cross-sectional survey.

To neglect any of these central difficulties is to lay
oneself open to very serious risk of error. Yet, to fail to use
such evidence in making judgments and taking action is to lay
oneself open to the often more serious dangers of unwarranted
inaction, or of action based merely upon rumor and ill-founded
opinion. We must recognize and deal with the three difficulties,
using care in interpretation. (emphasis added)

Footnote: The author remembers that Professor John Tukey wrote this section.


These  comments  are  as  relevant  to  the  NAEP  data  analysis  as  to  analyses 
performed  twenty  years  ago.  We  have  tried  to  ease  the  measurement  error 
problem.  The  other  problems,  surrogate  variables  and  causation,  are  still  a 
matter  of  concern  and  are  left  to  the  judgment  of  the  secondary  analyst  and 
reviewer. 

Computing  the  best  parameter  estimates  that  our  technology  allows  is 
costly  in  both  computer  expenses  and  conceptual  complexity.  In  the  next  two 
chapters, the technology is made available to those who choose to use it, as
we  did.  Ultimately,  however,  it  is  up  to  the  secondary  user  to  decide  the 
level  of  accuracy  he  or  she  needs  and  can  afford.  To  minimize  cost,  a 
secondary  user  might  use  one  plausible  value  as  if  it  were  a  test  score  and 
proceed  to  do  standard  analyses  using  available  statistical  systems.  We  do 
not  recommend  this  approach  because  it  will  often  lead  to  bias  and  to 
underestimates of sampling error, although these problems may be no worse
than  those  incurred  using  the  data  from  other  complex  educational  surveys. 
Simple  analyses  may  be  sufficient  for  exploratory  data  analysis,  but  more 
complex  methods  may  be  necessary  for  better  results,  especially  when  high 
order interactions or partial regression coefficients are involved. The
next several chapters contain a number of suggestions for improving the
accuracy of results at modest increases in cost and complexity, and we will
continue to develop methods of handling such data in the future.

The  NAEP  database  contains  a  very  large  amount  of  information  about  a 
carefully  selected  national  sample.  Part  I  of  this  technical  report  reveals 
the  care  that  has  gone  into  making  the  database  as  clean  as  possible.  We 
wish  to  encourage  secondary  users  to  make  full  use  of  this  important 
informational  resource. 




9.1    The Reading Data Analysis


A  major  strategy  in  our  approach  to  the  reading  data  analysis  was  the 
introduction  of  scaling  to  reduce  the  available  information  into  a 
manageable  and  interpretable  form.  The  reading  data  of  the  Year  15 
assessment  are  quite  voluminous,  with  340  exercises  administered  over  three 
grade/age  levels.  Reporting  the  percentage  passing  each  exercise  for  each 
age, grade, gender, race, ethnicity, region, etc., seemed to us too
burdensome  for  the  potential  audiences  of  NAEP's  reports,  although  such 
information  is  available  for  those  who  are  interested.  It  also  seemed  to  us 
that  reporting  the  average  of  the  percentages  passing,  even  when  reported 
for  selected  subsets  of  exercises,  did  not  take  full  advantage  of  the 
information  available  in  the  data.  We  intended  to  investigate  whether  the 
data  fell  approximately  on  a  single  dimension,  and,  if  so,  to  use  item 
response  theory  (IRT)  methods  to  form  a  scale.  The  scale  would  then  be 
interpreted,  using  behavioral  anchoring,  to  make  fairly  general  statements 
about  what  students  could  and  could  not  do. 

The  reading  data  analysis  also  had  to  explore  the  effect  of  changing 
the  method  of  exercise  administration  from  tape  recorder  to  print.  To 
examine  the  change,  we  conducted  parallel  assessments:  one  by  tape 
recorder,  as  in  the  past,  and  one  using  a  printed  presentation.  If 
comparison of the resultant data showed that the effect of the change followed
a  regular  pattern,  we  intended  to  equate  the  new  and  old  methods  of 
administration  and  develop  single  trend  lines  for  all  data  back  to  the 
first  reading  assessment  in  1970-71.  If  the  data  showed  substantial 
irregularity  between  the  two  data  sets,  we  would  present  the  results 
separately,  with  one  curve  showing  trend  from  1970-71  to  the  1983-84  tape 
administration  and  a  distinct  point  representing  the  1983-84  print 
administration,  the  beginning  of  the  future  trend  line. 

The  development  of  the  reading  scale  required  some  improvisation.  The 
ETS  proposal  assumed  that  a  block  of  reading  items  would  include  about 
twelve  scalable  reading  exercises  which  would  span  a  wide  range  of 
difficulty  so  that  few  students  would  be  able  to  answer  either  none  or  all 
of  the  exercises  correctly.    Because  many  students  would  be  given  two  or 
three  reading  blocks,  there  would  be  a  large,  random,  subsample  of  students 
who  responded  to  around  24  items.  Twenty-four  items  is  approximately  the 
number  of  exercises  usually  suggested  for  estimating  individual  performance 
using  the  maximum  likelihood  method.  However,  we  did  not  know  all  of  the 
properties  of  the  reading  exercises  at  the  time  of  proposal  writing  and, 
within  the  transition  time  constraint,  we  were  not  able  to  form  blocks  of 
exercises  that  met  the  "twelve  exercises  to  a  block  with  varying 
difficulty"  criterion.  Some  of  the  blocks  had  fewer  exercises,  some 
students  did  not  respond  to  all  the  exercises  offered,  many  students  were 
able  to  answer  all  exercises  correctly,  and  many  others  scored  less  well 
than  would  be  expected  by  chance.  The  total  effect  of  these  factors  was 
that  maximum  likelihood  estimates  of  reading  proficiency  were  attainable 
for  only  a  non-random  subsample  of  NAEP  students.  To  rectify  this 
situation,  marginal  estimation  procedures  were  used  to  estimate  a 
distribution of plausible reading proficiency values for an individual.
This procedure resulted in much more complicated methods for estimating
national  parameters. 
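
The difficulty with maximum likelihood, and the way a distribution of plausible values avoids it, can be sketched as follows. This is an illustration under stated assumptions, not the operational procedure: a two-parameter logistic item model, a standard normal prior, a coarse grid, and the item parameters shown are all hypothetical choices made for brevity. For a student who answers every exercise correctly, the log-likelihood keeps rising with proficiency, so no finite maximum exists; the posterior distribution remains proper and can be sampled.

```python
import math
import random


def p_correct(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    """Log-likelihood of right/wrong responses at proficiency theta."""
    total = 0.0
    for x, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total


def posterior_on_grid(responses, items, grid):
    """Normalized posterior over the grid under an assumed N(0, 1) prior."""
    log_post = [log_likelihood(t, responses, items) - 0.5 * t * t for t in grid]
    peak = max(log_post)
    dens = [math.exp(lp - peak) for lp in log_post]
    total = sum(dens)
    return [d / total for d in dens]


def posterior_sd(grid, probs):
    """Standard deviation of a discrete distribution on the grid."""
    m = sum(t * p for t, p in zip(grid, probs))
    return math.sqrt(sum((t - m) ** 2 * p for t, p in zip(grid, probs)))


grid = [-4.0 + 0.05 * k for k in range(161)]
few_items = [(1.0, b) for b in (-1.0, 0.0, 1.0)]
many_items = [(1.0, -2.0 + 4.0 * k / 23) for k in range(24)]

# All three exercises correct: the log-likelihood increases across the
# whole grid, so the maximum likelihood estimate is not finite.
ll_all_correct = [log_likelihood(t, [1, 1, 1], few_items) for t in grid]

# The posterior is proper in both cases and narrower with more exercises.
post_few = posterior_on_grid([1, 1, 1], few_items, grid)
post_many = posterior_on_grid([1, 0] * 12, many_items, grid)

# Five plausible values drawn at random from one student's distribution.
rng = random.Random(0)
plausible_values = rng.choices(grid, weights=post_few, k=5)
```

The operational model and estimation method are described in Chapter 10 and may differ from this sketch in the item model, the conditioning variables, and the sampling scheme.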

The  activities  in  the  reading  data  analysis  are  described  below.  It 
should be noted that these steps were not always performed serially but,
whenever possible, in parallel. We moved in parallel to improve the
timeliness of results, but usually at additional cost. For example, the
study of dimensionality was in progress at the same time as the scaling.
As we learned more about our data and methodology, we re-ran previous steps.
In logical order, the major steps were as follows:

(1)  The  dimensionality  of  the  spiral  reading  sample  was  explored. 
Unidimensionality  is  an  important  assumption  underlying  the 
scaling  methods  that  were  used.  Several  different  methods  of 
assessing the dimensionality were employed. The analysis
showed  a  general  consistency  of  the  data  with  the  concept  of 
unidimensionality  and  no  compelling  reason  to  use  more  than 
one  dimension.  The  results  of  the  dimensionality  study  are 
presented  by  Zwick  in  Chapter  10.1. 

(2)  The  spiral  sample,  which  used  printed  instructions,  was 
compared  to  the  tape  sample,  which  used  a  tape  recorder  for 
administration.  For  the  tape  sample,  the  instructions  were 
read  aloud  on  a  tape  recorder,  but  the  actual  reading 
exercises  themselves,  obviously,  could  not  be  read  aloud, 
although  the  student  was  paced  through  the  exercises,  that  is, 
told when to move on to the next exercise. For this reason,
the  tape  sample  is  sometimes  referred  to  as  the  paced  sample. 
We  expected  the  effect  of  using  a  tape  recorder  to  be  regular 
and  small,  and  thus  equatable,  although  not  ignorable,  and  it 
was.  The  study  of  the  differences  between  printed  and 
tape-recorded  administration  is  discussed  by  Mislevy  in 
Chapter  10.3. 

(3)  The  maximum  likelihood  estimates  of  the  parameters  for 
selected  items  from  the  spiral  data  were  estimated  using  the 
LOGIST program (Wingersky, Barton, & Lord, 1982). First, the
data  were  fitted  for  each  grade/age  sample  separately,  and 
then  for  all  ages  and  grades  combined.  Investigation  showed 
that  the  differences  between  item  parameters  computed  over  all 
grade/age  samples  and  those  computed  separately  for  each  were 
negligible.  Individual  reading  scores  were  calculated  for  all 
students  who  were  presented  at  least  seventeen  items.  This 
estimation procedure is described in detail by Wingersky in
Chapter  10.2. 

(4)  The properties of these estimates were explored and it was
found  that  finite  estimates  of  reading  ability  were  not 
available  for  about  15  percent  of  the  sample.  Furthermore,  the 
inability  to  compute  a  reading  score  was  associated  with 
various  background  and  attitude  questions;  for  example, 
different ethnic groups had different rates of inestimable
scores. To respond to this problem and to make reasonable
population  estimates  from  these  reading  scores,  Winsorization 
was  tried,  but  did  not  seem  to  produce  satisfactory  results. 
Winsorization  is  discussed  in  Chapter  10.2. 

(5)  Because  of  the  problems  with  the  maximum  likelihood  estimates, 
the  parameters  were  re-calibrated  by  Bayesian  procedures  using 
the  BILOG  program  (Mislevy  &  Bock,  1982).  In  this 
re-calibration,  only  subjects  who  were  administered  at  least 
two  blocks  were  used.  The  item  parameters  generated  by  BILOG 
were  similar  to  the  LOGIST  parameters. 

Using  the  appropriate  item  parameters  and  the  students' 
responses  to  the  reading  exercises  and  several  background 
questions,  a  distribution  of  plausible  values  was  computed  for 
each  student.  This  distribution  represents  the  uncertainty 
involved  in  estimating  an  individual's  reading  proficiency: 
if a student responded to many exercises, the variance of this
distribution would be small; if the student responded to only
a  few  exercises,  the  variance  would  be  large.    For  each 
student  who  was  presented  at  least  one  reading  exercise,  five 
plausible  values  were  randomly  selected  from  his  or  her 
distribution.  The  BILOG  scaling  is  discussed  by  Mislevy  and 
Sheehan  in  Chapter  10.3. 

Since  the  item  parameters  were  essentially  the  same  for  all 
three  grade/age  combinations,  the  single  reading  scale, 
spanning  all  ages  and  grades,  was  used. 

(6)  A  separate  item  calibration  was  performed  for  the  Year  15 
exercises  administered  using  a  tape  recorder  with  data  from 
past  assessments,  which  used  the  same  administrative 
procedures.  These  samples  were  available  only  for  age,  not 
grade,  samples.  The  data  for  each  age  were  calibrated 
separately. 

The  estimated  item  parameters  from  the  tape-administered 
sample  were  compared  with  those  computed  from  the  spiral  data 
which were administered by print. The parameter estimates were
reasonably  similar  and  so  the  estimates  from  the  tape 
administered  sample  were  equated  to  those  of  the  print  sample 
by  means  of  randomly  equivalent  spiral-  and  tape-administered 
samples  of  each  age  population.  This  item  calibration  is 
reported in detail by Mislevy and Sheehan in Chapter 10.4.

(7)  Some  reading  data  from  the  past  assessments  in  1970-71, 
1974-75, and 1979-80, as well as the data from the tape
administered sessions of the 1983-84 assessment, were
calibrated  together  using  the  BILOG  program.  The  public-use 
data  tapes  from  past  assessments  were  the  source  of  the 
student  responses  to  exercises.  Since  not  all  past  data  could 
be linked to the present assessment, only exercises that were
also used in 1983-84, and other exercises in the same
packages,  were  used.    A  separate  calibration  was  done  for  each 
grade/age.  The  details  of  this  analysis  are  reported  in 
Chapter  10.4. 

Since  the  reading  scale  is  not  appropriate  for  examining  the 
sub-areas (literal comprehension, inferential comprehension,
and  study  skills)  that  past  reading  reports  have  carried,  the 
trends  were  continued  by  using  the  same  reporting  method  used 
in  the  past.  To  maintain  comparability  with  the  past,  only  the 
tape  administered  exercises  and  only  age,  not  grade,  eligible 
students  were  used  from  the  1983-84  data.    Only  items  that  had 
been  used  in  several  assessments  were  used.  The  details  are 
reported  in  The  Reading  Report  Card  (1985). 

(8)  The  plausible  values  were  then  transformed  to  the  NAEP  reading 
scale. The NAEP reading scale takes the form of an estimated
true  score  on  a  hypothetical  test  with  known  properties.  This 
hypothetical  test  has  500  open-ended  items,  and  all  item 
responses  are  assumed  to  follow  the  logistic  model.  The  items 
all  have  equal  discriminating  power  and  their  difficulties  are 
equally  spaced  across,  and  somewhat  beyond,  the  range  of 
student performance. Scores on this test can range between
0 and 500.

Several  points  on  the  reading  proficiency  scale  were  anchored. 
The  purpose  of  anchoring  the  scale  was  to  enhance 
interpretability  by  reporting  what  the  vast  majority  of 
students at one level could do that most of the students at
lower levels could not. Several scale points were chosen for
anchoring: 150, 200, 250, 300, and 350. Reading exercises were
chosen  that  discriminated  between  these  reading  score  levels; 
the  rule  for  exercise  selection  was  that  at  least  80  percent 
of  the  students  at  one  level  could  answer  the  exercise 
correctly  whereas  less  than  50  percent  at  the  next  lower  level 
could.  The  selected  exercises  were  referred  to  a  committee  of 
reading  specialists  for  interpretation.  The  result  was  verbal 
descriptions  of  the  levels  of  reading  performance.  The  scale 
transformation and behavioral anchoring are described by Beaton
in  Chapter  10.5. 

(9)  The  performance  levels  of  students  at  ages  9,  13,  and  17  were 
then  computed.  Of  particular  importance  was  the  percent  of 
students  who  could  read  at  or  above  the  anchor  levels.  These 
percents,  and  all  other  reported  statistics,  were  computed 
using the plausible values for reading. Standard errors were
computed  using  the  jackknife  method.  The  methodology  for 
estimating  standard  errors  is  discussed  in  Chapter  13.2. 
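
Two pieces of step (8) lend themselves to a short sketch: the hypothetical 500-item test that defines the scale, and the rule used to select exercises for anchoring. The code below is illustrative only. The discrimination value, the range of item difficulties, and the function names are assumptions made here; the actual transformation is given in Chapter 10.5. Because the hypothetical items are open-ended, no guessing parameter is included.

```python
import math


def hypothetical_test_score(theta, n_items=500, a=1.0, b_low=-5.0, b_high=5.0):
    """Expected number correct on a test of logistic items.

    All items share the discrimination `a`, and their difficulties are
    equally spaced from `b_low` to `b_high`. These three values are
    assumed here for illustration.
    """
    step = (b_high - b_low) / (n_items - 1)
    return sum(
        1.0 / (1.0 + math.exp(-a * (theta - (b_low + k * step))))
        for k in range(n_items)
    )


def anchors_level(pct_correct_at_level, pct_correct_at_next_lower):
    """Selection rule from the text: at least 80 percent correct at one
    level and less than 50 percent correct at the next lower level."""
    return pct_correct_at_level >= 80.0 and pct_correct_at_next_lower < 50.0


# A proficiency at the center of the assumed difficulty range maps to
# the middle of the 0-500 scale.
mid_score = hypothetical_test_score(0.0)

# Hypothetical exercise: 85 percent correct at level 250, 40 percent at 200.
is_anchor_item = anchors_level(85.0, 40.0)
```

The expected score is bounded by 0 and 500 and increases with proficiency, which is what makes it usable as a reporting scale.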




9.2    The  Writing  Data  Analysis 


The  writing  data  analysis  involved  (1)  an  initial  investigation  as  to 
whether  or  not  the  writing  data  could  support  scaling,  (2)  the  actual 
scaling,  and  (3)  analysis  of  the  scaled  scores.  The  ETS  proposal  for  the 
NAEP  did  not  propose  to  spiral  the  writing  data  with  the  reading  data  nor 
to  scale  writing,  and  we  did  not  intend  to.  However,  the  advantages  of  a 
summary  scale  that  simplified  age-to-age  comparisons  and  facilitated 
secondary  analyses  led  us  to  investigate  the  possibility  of  developing  a 
writing  scale,  and  we  did.  The  result  was  a  simple,  useful  scaling 
procedure  and  plausible  values  for  a  writing  scale. 

Scaling  writing  was  quite  different  from  scaling  reading.  First,  the 
writing  exercises  were  all  professionally  scored  and  assigned  a  rating 
between zero and four, whereas the reading data were in the form of, or
could be converted to, right/wrong responses. Second, there were only 22
writing  exercises,  as  compared  to  340  reading  exercises  (of  which  228  were 
used  in  the  reading  scale),  and  only  ten  exercises  were  spiralled  so  that 
their  inter-correlations  could  be  estimated.  Finally,  most  students 
responded  to  only  one  or  two  exercises,  and  no  one  was  administered  more 
than  four  exercises.  Thus,  because  of  the  non-binary  nature  of  the 
exercises  and  the  limitation  on  the  amount  of  individual  information,  the 
well-developed  item  response  theory  that  was  applicable  for  reading  proved 
inappropriate  for  writing. 

The  writing  scale  that  we  used  is  an  extension  of  an  idea  presented  by 
Goldstein  and  James  (1983).  The  object  of  the  assessment  was  to  provide  an 
estimate  of  how  the  population  of  students  would  have  done,  on  the  average, 
if all students were administered all of the ten essays that were
spiralled.  To  reach  this  goal,  we  used  the  information  available  from  a 
student's  actual  responses  to  estimate  how  he  or  she  would  have  done  if 
administered  all  essays.  The  estimates  for  individuals  involved  some 
uncertainty  which  was  incorporated  into  the  plausible  values  for  writing 
and  thus,  ultimately,  into  the  estimates  of  standard  errors. 

The  steps  in  the  writing  analysis  were  performed  in  parallel  wherever 
possible, as in the reading analysis. The major steps were as follows:

*  Examining  the  rater  reliability  and  computing  basic  statistics 
of the writing data. A random sample of 20 percent of the
papers was scored twice by independent graders. The scorer
reliability  was  computed  separately  for  various  essays  and 
grade/age  combinations.  The  average  reliability  was  about  .90. 
The  means,  standard  deviations,  and  inter-correlations  among 
the  spiralled  essays  and  selected  background  and  attitude 
questions were computed for each grade. The results are
discussed  by  Beaton  in  Chapter  11.1. 

*  Comparing the spiral and tape results. In writing, unlike
reading,  the  actual  question,  as  well  as  the  assessment 
instructions,  could  be  read  to  the  students  in  the  tape 
sample. Substantial differences in the distributions of item
responses were found between those students who were
administered  the  exercises  using  a  tape  recorder  and  those  who 
were  required  to  read  the  question.  The  details  are  presented 
by  Johnson  in  Chapter  11.2. 

*  Analyzing the trend data. Since we did not feel that we could
equate  the  spiral  and  tape  results,  we  used  only  the  tape 
results  in  analyzing  trends.  There  were  only  a  few  essay 
exercises  that  were  used  in  the  past  and  offered  in  the  tape 
sessions,  and  these  were  analyzed  individually  to  produce  the 
trend  report.  The  details  are  presented  by  Johnson  in  Chapter 


*  Scaling the writing data. Some of the essay exercises were
administered  at  several  age  and  grade  levels,  and  the  same 
scoring  protocols  were  used,  regardless  of  the  ages  or  grades. 
The  inter-correlations  at  the  three  grades  were  compared  and 
found  to  be  not  significantly  different.  The  three 
inter-correlation  matrices  were  then  pooled  to  make  one 
correlation  matrix. 


Using  this  correlation  matrix  and  the  responses  of  an 
individual  student,  an  estimate  of  that  student's  average 
performance  on  all  ten  essays,  and  its  standard  error,  was 
made. Assuming a normal distribution of error, five random
values  were  selected  from  this  distribution  of  plausible 
scores  for  that  student. 


The  writing  scale  can  be  labeled  in  the  same  way  as  the  essays 
that it contains. The descriptions of levels of proficiency
were the same for all essays; there were five ordered
categories:    a  zero  was  no  response,  one  was  unsatisfactory, 
two  was  minimal,  three  was  satisfactory,  and  four  was 
elaborated.    The  common  labeling  for  exercise  responses 
automatically gave us labels for scale points, but we found
this  implicit  anchoring  to  be  unhelpful,  since  the  scale 
scores  had  a  substantially  smaller  variance  than  the 
individual  essays;  thus,  no  students  in  the  sample  had  scale 
scores  of  four.  The  details  are  presented  by  Beaton  and 
Johnson in Chapter 11.4.

For  the  cross-sectional  report  The  Writing  Report  Card 
(Applebee,  Langer,  &  Mullis,  1986b),  the  average  values  for 
the  different  grade  groupings  and  for  demographic  groupings 
were  computed.  Results  were  computed  for  each  plausible  value 
and  the  average  result  was  used  for  reporting.  All  results 
were  reported  with  their  standard  errors,  which  were  computed 
using  the  jackknife  method.  The  methods  are  discussed  in 
Chapter  13.2. 
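
The scaling step above can be sketched for the simplest case, a student who responded to a single essay. The sketch is hedged throughout: it assumes standardized essay scores that are jointly normal with the pooled correlation matrix, it uses a hypothetical matrix with all inter-correlations equal to .5, and the function name is invented here. The procedure actually used is described in Chapter 11.4 and may differ.

```python
import math
import random


def average_given_one_essay(z_observed, observed_index, corr):
    """Estimate the average standardized score over all essays.

    Under the assumed joint normality, the expected score on essay j
    given essay i is corr[i][j] * z_observed, and the conditional
    covariance of essays j and k is corr[j][k] - corr[i][j] * corr[i][k].
    Returns (estimate, standard error) for the average of all essays.
    """
    n = len(corr)
    row = corr[observed_index]
    cond_mean = z_observed * sum(row) / n
    total_cov = sum(
        corr[j][k] - row[j] * row[k] for j in range(n) for k in range(n)
    )
    cond_var = max(total_cov, 0.0) / (n * n)
    return cond_mean, math.sqrt(cond_var)


# Hypothetical pooled correlation matrix for the ten spiralled essays.
n_essays = 10
corr = [[1.0 if j == k else 0.5 for k in range(n_essays)] for j in range(n_essays)]

# A student one standard deviation above the mean on the first essay.
estimate, std_error = average_given_one_essay(1.0, 0, corr)

# Five plausible values from the assumed normal distribution of error.
rng = random.Random(1984)
plausible_values = [rng.gauss(estimate, std_error) for _ in range(5)]
```

The estimate is pulled toward the mean, which is consistent with the remark above that the scale scores had a substantially smaller variance than the individual essays.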




9.3    Background and Attitude Analyses


Analysis  of  the  background  and  attitude  data  has  been  largely 
restricted  to  the  basic  variables  used  in  the  report.  Since  trend  analyses 
are  restricted  to  variables  used  over  time,  these  reporting  variables  are 
those  used  by  ECS  in  past  assessments.    The  detailed  definitions  are 
described  in  Chapter  12. 

The  racial/ethnic  categorization  has  resulted  in  some  detailed  study 
which  has  been  reported  by  Rivera  and  Pennock-Roman  (1985).    Since  the 
first  assessment  in  1969,  NAEP  has  asked  its  administrators  to  note  the 
races  of  the  students  on  the  student  listing  form,  hence  the  variable 
called  "observed  ethnicity."     At  first,  students  were  classified  only  as 
black  or  white;  in  Year  3  (1971-72),  the  Hispanic  classification  was  also 
observed.    However,  the  small  sample  size  for  Hispanics  precluded  the 
creation of a separate reporting category until Year 11 (1979-80). In Year
7  (1975-76),  NAEP  began  to  ask  each  Age  17  student  to  report  his  or  her  own 
race  or  ethnicity,  hence  we  also  have  "self-reported  ethnicity." 
Self-reporting  of  race/ethnicity  was  added  for  Age  13  in  Year  11  (1979-80) 
and  for  Grade  4/Age  9  in  Year  15  (1983-84).    After  extensive  study  of  the 
differences  between  the  two  definitions  of  race/ethnicity,  the  observed  and 
self-reported  race/ethnicities  were  combined  into  "imputed  race/ethnicity." 
For  trend  reports,  observed  ethnicity  was  used,  as  in  the  past. 

For  The  Writing  Report  Card,  the  background  and  attitude  questions 
pertaining  to  writing  were  scaled  using  a  variation  of  the  ARM  method, 
which  was  used  for  the  writing  exercise  data. 

The  definitions  of  the  background  and  attitude  variables  are  discussed 
in  Chapter  12. 

9.4    Parameter  Estimation 

After  the  reading,  writing,  and  background  and  attitude  data  were 
readied,  the  estimation  of  the  competencies  of  students  in  American  schools 
began. 

The  programs  for  parameter  estimation,  as  well  as  many  of  the  programs 
for  data  base  creation  and  the  analysis  of  reading  and  writing  data,  were 
written  in  F4STAT,  the  ETS  proprietary  statistical  system  (see  Beaton, 
1964;  Beaton,  1973;  and  Educational  Testing  Service,  1984).    F4STAT  is  a 
system  of  procedures  for  use  with  FORTRAN  programs;  the  procedures  include 
subroutines  for  data  input  and  handling,  matrix  manipulation,  statistical 
estimation,  probability  calculations,  and  output  formatting  as  well  as  many 
other  general  purpose  service  routines.  The  procedures  are  assembled  in  an 
efficient  manner  for  specific  data  analytic  purposes.  Although  most,  if  not 
all,  of  the  calculations  done  here  could  be  computed  using  publicly 
available software, their costs and demands for machine resources might
make  these  calculations  prohibitively  expensive. 




First, the sampling weights were computed by Westat. In Chapter 13.1,
Johnson,  Hansen,  Tepping,  Lago,  and  Burke  describe  in  detail  how  the 
weights  were  computed.    Sampling  weights  were  initially  derived  from  the 
sampling  design,  then  adjusted  to  account  for  nonresponse  and  trimmed  to 
reduce  sampling  variance.    Then,  NAEP,  Current  Population  Survey  (CPS),  and 
Census  estimates  of  population  sizes  were  combined  into  an  optimum  estimate 
of  size  for  a  number  of  subpopulations.  Weights  were  computed  for  students, 
the  teachers  of  these  students,  and  schools. 
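
The nonresponse adjustment and trimming steps can be sketched as follows. The adjustment cell, the cap, and the function names are hypothetical and chosen only to show the mechanics; Chapter 13.1 describes the rules actually applied.

```python
def adjust_for_nonresponse(base_weights, responded):
    """Spread the weight of nonrespondents over the respondents in one
    adjustment cell, so the cell's total weight is preserved."""
    total = sum(base_weights)
    responding = sum(w for w, r in zip(base_weights, responded) if r)
    factor = total / responding
    return [w * factor if r else 0.0 for w, r in zip(base_weights, responded)]


def trim_weights(weights, cap):
    """Limit very large weights to a cap to reduce sampling variance."""
    return [min(w, cap) for w in weights]


# Hypothetical cell of four sampled students, one of whom did not respond.
base_weights = [10.0, 10.0, 20.0, 60.0]
responded = [True, True, False, True]

adjusted = adjust_for_nonresponse(base_weights, responded)
trimmed = trim_weights(adjusted, cap=50.0)
```

Trimming trades a reduction in variance for a possible bias, since the trimmed weights no longer sum to the cell total.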

Next,  parameters  for  the  nation  as  a  whole  and  for  specified 
subpopulations  were  estimated.  The  jackknife  method  was  used  to  estimate 
the  sampling  error  of  the  parameter  estimates.  Thirty-two  synthetic  samples 
were  formed  from  the  64  PSUs  in  the  sampling  design,  and  separate  student 
weights  were  computed  for  each  of  those  synthetic  samples.    The  original 
sample  weight  was  used  for  parameter  estimation  and  the  synthetic  samples 
were used for estimating sampling error. The details are covered in Chapter
13.2.
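
A paired jackknife of this kind can be sketched as follows. The sketch assumes a common convention that is not spelled out in this chapter: each synthetic sample drops one PSU of a pair and doubles the weight of the other, and the variance estimate is the sum of squared deviations of the replicate estimates from the full-sample estimate. The record layout and function names are hypothetical; Chapter 13.2 gives the procedure actually used.

```python
def weighted_mean(records, weight_of):
    """Weighted mean score, with the weight supplied by a function."""
    numerator = sum(weight_of(r) * r["score"] for r in records)
    denominator = sum(weight_of(r) for r in records)
    return numerator / denominator


def replicate_weight(record, modified_pair):
    """Student weight for the synthetic sample that modifies one PSU pair."""
    if record["pair"] != modified_pair:
        return record["weight"]
    # Assumed convention: drop the first PSU, double the second.
    return 0.0 if record["psu_in_pair"] == 0 else 2.0 * record["weight"]


def jackknife_variance(records):
    """Return (full-sample estimate, jackknife variance estimate)."""
    full = weighted_mean(records, lambda r: r["weight"])
    pairs = sorted({r["pair"] for r in records})
    replicates = [
        weighted_mean(records, lambda r, p=p: replicate_weight(r, p))
        for p in pairs
    ]
    return full, sum((t - full) ** 2 for t in replicates)


# Toy data: two PSU pairs, one student per PSU. The design described in
# the text would have 32 pairs formed from 64 PSUs.
records = [
    {"pair": 0, "psu_in_pair": 0, "weight": 1.0, "score": 10.0},
    {"pair": 0, "psu_in_pair": 1, "weight": 1.0, "score": 20.0},
    {"pair": 1, "psu_in_pair": 0, "weight": 1.0, "score": 30.0},
    {"pair": 1, "psu_in_pair": 1, "weight": 1.0, "score": 40.0},
]

estimate, sampling_variance = jackknife_variance(records)
```

The full-sample weight produces the estimate itself; the replicate weights are used only to measure how much that estimate varies.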

Another  form  of  variability  in  parameter  estimation  comes  from  the 
uncertainty  involved  in  the  imputation  of  plausible  values.  Mislevy 
discusses  this  uncertainty,  its  estimation,  and  the  use  of  plausible  values 
in  Chapter  13.3. 

Finally,  although  many  different  statistics  have  been  computed  for 
various  reports,  certain  simple,  basic  statistics  are  of  such  general  value 
that we have computed them routinely and made them available to the NAEP
staff  for  exploration,  interpretation,  and  reference.  Tabulating  these 
simple  statistics  has  resulted  in  many  volumes  of  statistical  tables  which 
are  sometimes  referred  to  as  almanacs.  Tables  cover  both  Year  15  and  trend 
data.  In  Chapter  13.4,  Zwick  describes  the  basic  tables,  their  contents, 
and  their  use. 


9.5    Supplementary  Studies 

The  Year  15  data  analysis  has  entailed  several  supplementary  studies 
which  are  reported  here. 

The  ETS  Standards  for  Quality  and  Fairness  requires,  among  other 
things,  the  study  of  the  validity  of  reported  results.    Applying  the  usual 
methods  used  for  individual  testing  when  the  results  are  used  only  for 
groups  of  persons  is  inappropriate.    Zwick  describes  how  the  content  and 
construct  validity  of  the  reading  and  writing  data  were  evaluated  in 
Chapter  14.1.    The  study  showed  that,  in  general,  the  content-  and 
construct-related  evidence  were  supportive  of  the  validity  of  the  Year  15 
NAEP  reading  and  writing  assessments. 

Because  using  the  jackknife  is  somewhat  cumbersome  for  secondary 
analysts,  we  computed  design  effects  for  a  number  of  parameter  estimates. 
The  design  effect  is  a  measure  of  the  difference  in  efficiency  in  parameter 
estimation  between  a  complex  sampling  design  and  a  simple  random  sample, 
and can be used to simplify analysis procedures by achieving approximate
results. Note that we have used the jackknife, not design effects, in
NAEP analyses; the design effects are for secondary analysis. In our
opinion, the NAEP design effects were found to be reasonably small for this
type of survey. Johnson provides the details in Chapter 14.2.
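
How a secondary analyst might apply a design effect can be sketched as follows. The text does not state a formula, so the sketch assumes the common formulation in which the design effect is the ratio of the variance of an estimate under the complex design to the variance that a simple random sample of the same size would give. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def design_effect(var_complex: float, var_srs: float) -> float:
    """Ratio of the complex-design variance to the simple-random-sample
    variance of the same estimate (assumed definition)."""
    return var_complex / var_srs


def approximate_se(se_srs: float, deff: float) -> float:
    """Inflate a simple-random-sample standard error by sqrt(deff)."""
    return se_srs * math.sqrt(deff)


def effective_sample_size(n: int, deff: float) -> float:
    """Size of the simple random sample with equivalent precision."""
    return n / deff


# Hypothetical values: jackknife variance 4.0, SRS-formula variance 2.0.
deff = design_effect(4.0, 2.0)
se = approximate_se(1.0, deff)
n_eff = effective_sample_size(2000, deff)
```

Results obtained this way are approximations; the jackknife remains the method used in the NAEP analyses themselves.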




Chapter  10 
THE  READING  DATA  ANALYSIS:  INTRODUCTION 


Robert  J.  Mislevy 
Educational  Testing  Service 


This  chapter  describes  the  analyses  carried  out  on  responses  to 
cognitive  exercises  in  the  Year  15  NAEP  reading  assessment  leading  to  the 
results that appear in The Reading Report Card: Progress Toward Excellence
in Our Schools (1985). The emphasis is on item response theoretic (IRT)
scaling procedures, an innovation to NAEP beginning with the learning area
of reading in the Year 15 assessment. This introductory chapter outlines
general arguments for scaling, and discusses the special challenges that
arise  in  the  NAEP  setting.    Subsequent  chapters  detail  the  methods  and 
results  of  specific  procedures.    Brief  summaries  of  these  chapters  follow. 

Chapter  10.1  -  Dimensionality  of  cognitive  reading  exercises 

It is a strong assumption to posit an IRT model in which a
single examinee characteristic accounts for responses to all
items. This chapter describes analyses performed on the Year 15
BIB spiral data that explored the extent to which this assumption
of unidimensionality is satisfied for the items included in the
reading scale.


Chapter  10.2  -  Joint  estimation  procedures 

The  ETS  proposal  for  the  analysis  of  the  Year  15  reading  data 
specified  procedures  based  on  the  estimation  of  proficiency 
variables  for  each  respondent.    This  chapter  describes  the 
analyses  performed  to  this  end,  and  documents  the  evidence  for 
concluding  the  results  were  unsatisfactory. 


Chapter  10.3  -  Marginal  estimation  procedures 

Alternative  IRT  procedures  that  do  not  require  precise  point 
estimates  of  individual  examinee  parameters  are  described  in  this 
section.    These  procedures  include  marginal  estimation  of  item 
parameters  and  population  characteristics,  and  "plausible  values" 
associated  with,  but  not  estimates  of,  individual  examinees' 
proficiencies. Also described here is the equating of the
responses gathered under BIB spiral conditions in Year 15 to those
gathered under paced conditions in previous NAEP assessments and
in the Year 15 paced bridge sample.


Chapter  10.4  -  Trend  data 

This  section  describes  the  extension  of  the  IRT  reading 
scale,  defined  originally  on  Year  15  BIB  spiral  data,  to  the  paced 
data  of  the  previous  NAEP  assessments. 

Chapter  10.5  -  Scale  definition  and  behavioral  anchoring 

This section details the procedures by which results on the
IRT reading scale were related to expected performance on specific
reading tasks.

10.0.1    Item  Response  Theory 

Item  response  theory  (IRT)  provides  a  mathematical  model  for  the 
probability  that  a  particular  examinee  will  make  a  correct  response  to  a 
particular  item,  in  terms  of  a  parameter  reflecting  the  examinee's 
proficiency,  and  one  or  more  parameters  characterizing  features  of  the  item 
such  as  its  difficulty  and  reliability  (Lord,  1980).    As  an  example,  the 
three-parameter  logistic  IRT  model  (the  model  used  in  the  NAEP  reading 
assessment)  takes  the  following  form: 

$$P(x_{ij} = 1 \mid \theta_i, a_j, b_j, c_j) = c_j + \frac{1 - c_j}{1 + \exp[-1.7\,a_j(\theta_i - b_j)]},$$

where 

$x_{ij}$ is the response of pupil i to item j, 1 if correct and 0 if
incorrect,

$\theta_i$ is the (unobservable) proficiency of pupil i,

$a_j$ is the slope parameter of item j, characterizing its
sensitivity to proficiency,

$b_j$ is its threshold parameter, characterizing its difficulty,

$c_j$ is its lower asymptote parameter, reflecting possibly
non-zero chances of correct response from even persons
of very low proficiency, and

1.7 is a scaling constant.
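
The three-parameter logistic model above can be evaluated directly. The following Python sketch is illustrative only; the function name and the item parameter values are hypothetical and are not NAEP estimates.

```python
import math


def p_correct(theta: float, a: float, b: float, c: float,
              scale: float = 1.7) -> float:
    """Probability of a correct response under the three-parameter
    logistic model: c + (1 - c) / (1 + exp(-scale * a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))


# Hypothetical item: slope 1.0, threshold 0.0, lower asymptote 0.2.
at_threshold = p_correct(0.0, a=1.0, b=0.0, c=0.2)   # halfway between c and 1
very_low = p_correct(-10.0, a=1.0, b=0.0, c=0.2)     # close to c
very_high = p_correct(10.0, a=1.0, b=0.0, c=0.2)     # close to 1
```

When proficiency equals the threshold b, the probability is c + (1 - c)/2, halfway between the lower asymptote and 1; far below the threshold it approaches c, and far above it approaches 1.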

The curve traced by this function for a given item as $\theta$ runs over its
range is referred to as an "item response curve." A domain of items over
which performance is modeled, and the accompanying proficiency variable,
are often jointly referred to as a "scale."

IRT has effectively revolutionized measurement in education and
psychology because of the advantages it offers over traditional
"true-score" or "number-right" test theory. Of particular note are (i) its
capacity to provide comparable measurements from different item sets
without expensive equating studies, (ii) its flexibility to administer
examinees sets of items that are tailored to their proficiency levels, and
(iii) its ability to yield scores that can be interpreted in terms of
expected behavior on every item in the scale.

To date, applications of IRT have been limited for the most part to
individual measurement. That is, each individual is
administered a sufficient number of items to provide a precise estimate of
his or her (unobservable) proficiency parameter, an estimate that is then
used in subsequent decision-making or secondary analysis. It has been
argued, however, that the advantages mentioned in the preceding paragraph
hold promise for the assessment setting as well (Bock, Mislevy, & Woodson,
1982; Messick, Beaton, & Lord, 1983), despite the fact that interest lies
not in the proficiencies of individual examinees but in proficiency
distributions in populations, the populations' trends over time,
and relationships with pedagogically and socially relevant background
variables.

10.0.2    Item Response Theory and Educational Assessment

One reason for the interest in IRT for NAEP was dissatisfaction with the
reporting methods that had evolved prior to the Year 15 assessment. When
NAEP began, the plan was to report for each individual item the
estimated proportion of correct responses from a population or a
subpopulation. This single-item reporting quickly proved too cumbersome,
and by the second reading assessment NAEP reported averages of estimated
percents-correct for sets of related items. Comparisons over time or
across groups in terms of these "domain percents-correct" were necessarily
limited to sets of items common to all groups involved in the comparison, a
limitation strongly felt as the NAEP item pool evolved over time, thus
reducing the number of items by which trends could be estimated.
Interpretations of domain percents-correct were limited as well, since
generalizations to different item sets or implications for
particular items are not forthcoming. Finally, because different
individuals were administered different items under NAEP's matrix sampling
design, nothing comparable to traditional test scores was obtained to
facilitate analyses of the relationships among proficiency and other
variables.

Three objectives, then, were established for the use of IRT in NAEP:

(1) Results should be summarized in a manner which would
facilitate comparisons over time and across subpopulations
(including different ages and grades), despite the fact that
different item sets were administered to different targeted
comparison groups.

(2)  Results  should  be  reported  on  a  scale  that  could  be 
interpreted  in  terms  of  expected  behavior  on  tasks  involving 
reading. 

(3)  Secondary  users  should  be  provided  results  in  a  form  that 
facilitates  analyses  of  the  relationships  among  reading 
proficiency  and  examinee  characteristics,  such  as 
instructional  experiences  and  demographic  data. 

The  original  intention  was  to  accomplish  these  objectives  by  estimating 
each  sampled  student's  proficiency  variable  on  an  IRT  scale.  Distributions 
of  these  estimates  would  be  taken  as  approximations  of  the  latent 
proficiencies  themselves,  both  for  NAEP  reports  and  for  secondary  analyses. 
As  documented  in  Chapter  10.2,  however,  this  approach  proved 
unsatisfactory, mainly because most pupils responded to too few cognitive
exercises to provide precise point estimates of their latent proficiency
variables.    More  complex  methods  that  could  provide  estimates  of  population 
characteristics  without  estimating  values  for  individual  respondents  had  to 
be  developed. 

Anticipating  and  summarizing  the  contents  of  the  chapter,  the  new 
methodologies  accomplished  objectives  1  and  2  in  full.    Objective  3, 
providing  useful  data  for  secondary  analysts,  is  satisfied  to  a  large  but 
incomplete  extent.    The  procedures  that  would  be  required  to  support  all 
conceivable  secondary  analyses  of  NAEP  data,  to  the  level  of  accuracy 
inherent in the data, turn out to be beyond the reach of present (and
indeed,  foreseeable)  resources.    The  procedures  described  in  Chapter  10.3 
do  however  possess  the  properties  of  (i)  yielding  consistent  estimates  on 
the  IRT  scale  for  results  related  to  the  traditional  NAEP  reporting 
variables,  (ii)  providing  approximate,  though  sub-optimal,  results  for 
other  background  variables  (potential  biases  of  15  to  40  percent  in  certain 
regression  coefficients,  for  example),  and  (iii)  laying  the  methodological 
foundation for improved estimation of background effects in future NAEP IRT
analyses (reducing potential biases to a maximum of, say, 5 percent for a
broad  range  of  policy  analyses). 

Two  points  merit  emphasis  here.    First,  all  analyses  that  could  be 
carried out with past NAEP data can still be carried out with equal or greater
precision  with  the  Year  15  data.    Because  item  responses  are  provided  on 
public-use  data  tapes,  nothing  is  lost  to  the  secondary  analyst  by  the  fact 
that  some  results  are  reported  on  an  IRT  scale. 

Second, the biases mentioned in the preceding paragraph are not
shortcomings of our procedures but limitations inherent in the data,
namely the sparseness of information about individual respondents. When it
is desired to draw inferences from results on specific items to
proficiencies of a more general nature, the biases of "errors in variables"
problems arise. Typically, because they are difficult to deal with and are
not well understood in the educational research community, they are
ignored (as in analyses of the High School and Beyond). This standard
practice would prove disastrous for trend analyses in NAEP data, since the
effects depend on the data-gathering design; the variation of
NAEP's sampling design over time, due in part to varying levels of funding,
would render useless any trend analyses that ignored these effects. The
innovations described in Chapters 10.3 through 10.5, however, open the door
to analyses that are based on IRT (e.g., The Reading Report
Card, 1985), and in which errors in variables are handled appropriately.

IRT provides more powerful analyses than percent-correct reporting in
large degree because it makes more assumptions about relationships among
examinees' expected responses to items. The original justification for
reporting item-by-item percents-correct, for example, was the fact that each
item offers some unique information about trends and population
comparisons. Nonetheless, trends or comparisons based on each of several
items may well exhibit similarities; most geometry
items might be becoming easier over time, for instance, while most algebra
items are becoming more difficult. Fitting one unidimensional IRT model to
all algebra items and another to all geometry items will capture these
commonalities, operationally defining the latent "algebra" and "geometry"
proficiency variables in terms of the similarities of patterns of the items
in a scale.

The  cost  of  using  the  IRT  models  is  the  loss  of  information  about 
differences  among  the  patterns  of  items  within  a  scale.    If  both  algebra 
and  geometry  were  modeled  by  a  single  scale  in  the  example  above,  for 
instance,  the  IRT  single-variable  summary  would  not  appropriately  reflect 
the  differential  changes  of  items  of  the  two  types.    Technically,  model 
mis-specification errors of this type are referred to as
"multidimensionality" or "lack of local independence." (See Goldstein,
1980, and Traub and Wolfe, 1981, for insightful discussions of the threat
such errors pose to the use of IRT in educational assessment.) Similarly
lost will be differential patterns of performance on the items within a
scale for reasons of (i) differing curricula or teaching styles over
schools, (ii) changes in curricular emphasis over time, and (iii) regional
or ethnic-group differences.

This line of reasoning leads to three important conclusions.
First, it is clear that summaries of assessment data in terms of IRT
variables merely reflect the dominant patterns recurring within a much
richer body of data. At best they serve as summary indicators
like the Gross National Product or the Consumer Price Index: they will
undoubtedly prove inadequate for more subtle analyses that demand
differential information among items within scales, for comparing detailed
effects of teaching methodologies or for analyzing item performance in
terms of specific skill components demanded by particular exercises.
(See, for example, Haertel's [1984] use of latent class models to study
NAEP mathematics exercises in terms of the skills they demand.) IRT
proficiency variables may be justified by their usefulness as a data
reduction technique, but it must be borne in mind that they are not founded
strictly in accordance with either pedagogical or psychological theories
about the skills examinees bring to bear upon the exercises they are
presented.

Second,  because  IRT  variables  are  defined  operationally  within  scales, 
pedagogical  and  psychological  theories  must  play  a  role  in  determining  the 
domains  of  items  that  will  be  scaled  together.    Because  differential 
patterns  within  a  domain  will  not  be  reflected  by  the  IRT  results,  scaling 
should  be  carried  out  within  domains  for  which  broad  summaries  are  sensible 
and  policy-relevant.    These  decisions  must  be  theory-driven  as  well  as 
data-driven  (see  paragraph  below).    For  this  reason,  "study  skills"  tasks 
requiring  declarative  knowledge  were  eliminated  from  the  domain  that  became 
the  basis  of  the  NAEP  reading  scale.    This  focused  the  analysis  on  the  more 
generalized  skills  commonly  thought  of  as  reading  per  se,  among  which 
different  curricula  or  backgrounds  were  less  likely  to  impose  strong 
differential  patterns  of  item  performance. 

Third  and  finally,  the  burden  thus  falls  upon  those  who  propose  to  use 
IRT  in  educational  assessment  to  demonstrate  (and  to  continue  to 
demonstrate  over  time)  that  the  domains  of  items  within  which  they  carry 
out  IRT  scaling  are  in  fact  capturing  relevant  patterns  of  change.  This 
must  be  done  by  examining  what  are  in  a  broad  sense  residuals  from  the  IRT 
models:  for  example,  factor  analyses  of  items  within  scales,  analyses  of 
residuals  from  fitted  item  response  curves,  and  examinations  of  the 
stability  of  item  response  curves  over  time.    (Analyses  of  this  type  are 
described in Chapters 10.1 through 10.4.)


Chapter  10.1 


ASSESSMENT OF THE DIMENSIONALITY
OF  NAEP  YEAR  15  READING  DATA^ 

Rebecca  Zwick 
Educational  Testing  Service 


10.1.1    The Unidimensionality Assumption in Item Response Theory

To  determine  whether  it  was  reasonable  to  regard  the  reading  items 
administered  in  the  Year  15  NAEP  data  collection  as  measures  of  a  single 
construct,  a  series  of  analyses  of  the  dimensionality  of  the  reading  data 
was  performed.  Dimensionality  analyses  were  conducted  both  within  and 
across the three grade/ages, 4/9, 8/13, and 11/17. It was important to
investigate the dimensionality issue because the validity of the item
response theory (IRT) model used to estimate reading proficiency in the
1983-1984 NAEP survey rests on the assumption of unidimensionality. It
should  be  noted,  however,  that  regardless  of  whether  an  IRT  model  is  used, 
it  is  ordinarily  assumed  that  items  on  an  achievement  test  can  be  treated 
as  measures  of  a  single  dimension,  in  this  case,  reading  proficiency. 
Scoring  a  test  by  simply  summing  the  item  scores  involves  an  implicit 
assumption  of  unidimensionality;  IRT  scaling  formalizes  this  assumption. 

The reading data were analyzed using the three-parameter logistic model
(Birnbaum, 1968; Lord, 1980) in which $P_{ij}$, the probability that subject i
gets item j correct, can be expressed as follows:

$$P_{ij} = P(x_{ij} = 1 \mid \theta_i) = c_j + \frac{1 - c_j}{1 + e^{-1.7\,a_j(\theta_i - b_j)}},$$

where $\theta_i$ is the proficiency parameter for person i, $a_j$ is the item
discrimination parameter, $b_j$ is the item difficulty, and $c_j$ can be
interpreted as the probability that a person with very low ability gets
item j correct. (Model parameters were estimated using BILOG [Mislevy &
Bock, 1982]; details are provided in Chapter 10.3.) In applying a model of
this kind, it is assumed that the only examinee characteristic that affects
item response is a single latent variable, $\theta$.


The author thanks Albert Beaton, Bruce Bloxom, Darrell Bock, Neil
Dorans, Paul Holland, Robert Mislevy, Paul Rosenbaum, and Ledyard Tucker,
who provided consultation and comments; Dick Harrison, Bruce Kaplan, and
Dorothy Thayer, who programmed the analysis procedures; and Natalie Roca,
who conducted analyses and literature reviews. An earlier version of this
chapter is available as ETS Research Report 86-4.


10.1.1.1    Robustness  of  IRT  Estimation  Procedures 

In practice, the assumption of unidimensionality, required for the
application of conventional IRT models, will always be violated to some
degree. To make a more objective determination as to what constitutes an
important departure from unidimensionality, we need to know more about the
robustness of the IRT estimation procedures to violations of the
unidimensionality assumption. Although there has been little theoretical
work in this area, some empirical studies have been conducted. Reckase
(1979) and Drasgow and Parsons (1983) investigated the results of
estimating the three-parameter logistic model, using LOGIST (M. S.
Wingersky, 1983) under violations of the unidimensionality assumption.
(The one-parameter and two-parameter logistic models were also examined by
Reckase, 1979, and Drasgow and Parsons, 1983, respectively.) Reckase's
study was based on five actual data sets and five data sets constructed to
have specific factor structures. He concluded that LOGIST estimates "the
first  principal  component  when  it  is  large  relative  to  other  factors  .... 
good  ability  estimates  can  be  obtained  ...  even  when  the  first  factor 
accounts  for  less  than  10  percent  of  the  test  variance,  although  item 
calibration  results  will  be  unstable.    For  acceptable  calibration,  the 
first  factor  should  account  for  at  least  20  percent  of  the  test  variance" 
(p.  228).    Drasgow  and  Parsons  (1983)  made  use  of  a  hierarchical  model  with 
a  general  latent  trait  as  well  as  five  group  factors  to  simulate  various 
kinds  of  latent  structures.    One  of  their  conclusions  was  that,  in  the 
simulated  data  designed  to  resemble  "moderately  heterogeneous  achievement 
tests  and  attitude  assessment  instruments"  (p.  193),  LOGIST  still  recovered 
the  latent  trait  and  provided  acceptable  estimates  of  the  item  parameters 
(p.  198).    There  is  no  reason  to  believe  that  the  effects  of  multi- 
dimensionality  on  BILOG  (Mislevy  &  Bock,  1982),  which  was  used  to  scale  the 
NAEP  data,  would  differ  from  the  results  obtained  with  LOGIST  (Mislevy, 
personal  communication,  October  1985).    These  findings  suggest  that  IRT 
scaling  procedures  can  produce  satisfactory  results  under  moderate 
departures from unidimensionality.
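
A rough check in the spirit of Reckase's guideline is to ask what share of the total variance the first principal component of an inter-item correlation matrix accounts for. The sketch below is an assumption-laden illustration: it uses plain power iteration started from a vector of ones (adequate when the correlations are mostly positive), it takes the trace of the matrix as the total variance, and the example matrix is hypothetical. Which correlation matrix is appropriate for dichotomous items is itself at issue in this chapter.

```python
import math
from typing import List

Matrix = List[List[float]]


def largest_eigenvalue(matrix: Matrix, iterations: int = 500) -> float:
    """Largest eigenvalue of a symmetric matrix by power iteration.
    Assumes the leading eigenvector is not orthogonal to the start vector."""
    n = len(matrix)
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient


def first_component_share(corr: Matrix) -> float:
    """Proportion of total variance (the trace) on the first component."""
    trace = sum(corr[i][i] for i in range(len(corr)))
    return largest_eigenvalue(corr) / trace


# Hypothetical three-item matrix with all correlations equal to 0.5.
example = [[1.0, 0.5, 0.5],
           [0.5, 1.0, 0.5],
           [0.5, 0.5, 1.0]]
share = first_component_share(example)
```

For the example matrix the eigenvalues are 2, 0.5, and 0.5, so the first component carries two thirds of the variance, well above the 20 percent figure quoted above.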


10.1.2.    Methods  of  Dimensionality  Assessment  for  Dichotomous  Data 

The traditional psychometric approach to the assessment of
dimensionality  is  through  factor-analytic  methods.    Factor  analysis  often 
produces  satisfactory  results  when  each  of  the  variables  is  the  score  on  a 
multi-item  test.  When  each  of  the  measures  is  the  response  to  a 
dichotomously  scored  item,  however,  it  is  now  well  known  that  linear  factor 
analysis  of  Pearson  (phi)  correlations  does  not,  in  general,  yield  a 
correct  representation  of  the  dimensionality  of  the  item  pool  (see,  e.g., 
Carroll, 1945, 1983; Hulin, Drasgow, & Parsons, 1983; McDonald & Ahlawat,
1974; Mislevy, 1986c). The fundamental problem is that in computing phi
correlations, item responses are treated as true dichotomies. In applying
a  linear  factor  analysis  model,  we  are  hypothesizing  that  dichotomous 
variables  are  linear  combinations  of  continuous  latent  variables  with 
infinite  range,  a  mathematical  impossibility.    In  fact,  the  regression  of  a 
dichotomous  item  on  a  continuous  latent  variable  must  be  nonlinear.  The 
best  linear  approximation  to  the  nonlinear  regression  will  depend  on  the 
region in which the data are most dense (Mislevy, 1986c); that is, it will
be  related  to  the  item  mean,  or  difficulty  (as  defined  in  classical  test 
theory).    From  this  perspective,  it  is  not  surprising  that  linear  factor 
analysis  of  dichotomous  items  often  produces  a  second  factor,  typically 
called  a  difficulty  factor,  that  is  related  to  item  difficulty,  but  appears 
to  be  unrelated  to  any  substantive  property  of  the  items.    There  can,  in 
fact,  be  more  than  one  such  spurious  factor  (as  is  the  case  for  items  that 
form  a  perfect  Guttman  scale),  but  ordinarily,  only  one  is  substantial  in 
size. 
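
The dependence of phi on item difficulty can be seen in a small numerical sketch. The function name and the two 2 x 2 tables below are hypothetical. Both tables describe item pairs that form a perfect Guttman pattern (no one passes the harder item while failing the easier one), yet phi reaches 1 only when the two items are equally difficult.

```python
import math


def phi_coefficient(n11: int, n10: int, n01: int, n00: int) -> float:
    """Pearson (phi) correlation from a 2 x 2 table of item responses.
    n11: both correct; n10: item 1 only; n01: item 2 only; n00: neither."""
    item1_right = n11 + n10
    item1_wrong = n01 + n00
    item2_right = n11 + n01
    item2_wrong = n10 + n00
    numerator = n11 * n00 - n10 * n01
    denominator = math.sqrt(
        item1_right * item1_wrong * item2_right * item2_wrong
    )
    return numerator / denominator


# Item 1 easy (80 percent correct), item 2 hard (20 percent correct).
unequal_difficulty = phi_coefficient(n11=20, n10=60, n01=0, n00=20)  # 0.25

# Both items answered correctly by 50 percent of examinees.
equal_difficulty = phi_coefficient(n11=50, n10=0, n01=0, n00=50)     # 1.0
```

The drop from 1.0 to 0.25 reflects only the difference in difficulty, which is the mechanism behind the spurious difficulty factor.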

As  an  alternative  to  phi  coefficients,  tetrachoric  correlations  between 
items  can  be  obtained.    In  computing  tetrachorics,  it  is  assumed  that  the 
item  responses  are  functions  of  underlying  continuous  variables  that  have  a 
bivariate  normal  distribution.    The  model  dictates  that,  for  each  item, 
individuals  who  have  values  greater  than  a  certain  threshold  on  the 
underlying response variable get that item correct; individuals with values
lower  than  the  threshold  get  it  wrong.    Using  the  bivariate  normality 
assumption,  the  correlation  between  the  unobserved  continuous  variables  can 
be  inferred  from  the  2  x  2  table  of  item  responses.  Tetrachoric 
correlations  do  not  provide  a  valid  measure  of  association  if  bivariate 
normality  does  not  hold.    Furthermore,  the  occurrence  of  guessing  violates 
the  above  model,  which  postulates  that  the  probability  that  an  individual 
gets an item right is a function only of his value on the underlying
response  variable.    When  guessing  does  occur,  factor  analysis  of 
tetrachorics  can  produce  spurious  factors  (see  Carroll,  1945,  1983;  Hulin, 
Drasgow,  and  Parsons,  1983).    Adjustments  for  guessing  are  theoretically 
possible,  but  often  lead  to  unacceptable  results  in  practice.     (Attempts  to 
adjust  for  the  effects  of  guessing  in  the  NAEP  analyses  are  discussed  in 
Section  10.1.3.2.1.)    Additional  problems  are  inaccuracies  in  the 
computation  of  tetrachorics  as  their  absolute  values  approach  unity,  the 
large  standard  errors  of  the  coefficients,  and  the  occurrence  of 
non-Gramian matrices of sample tetrachorics, even when data are complete.
(In  the  case  of  the  NAEP  analyses,  in  which  a  large  proportion  of  data  are 
missing  by  design,  the  negative  eigenvalues  tend  to  comprise  a  large 
proportion  of  the  trace  of  the  tetrachoric  matrix;  see  Section  10.1.3.1.2 
and  Table  10.1(3).) 
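
The threshold model behind the tetrachoric correlation can be made concrete with a short sketch. It is not the routine used in the NAEP analyses: the function names are hypothetical, the thresholds are taken from the item margins, and the correlation is found by bisection so that the bivariate normal probability of both underlying variables exceeding their thresholds matches the observed proportion answering both items correctly. No adjustment for guessing is made, and tables with empty margins are not handled.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def prob_both_above(h: float, k: float, rho: float, steps: int = 2000) -> float:
    """P(X > h, Y > k) for a standard bivariate normal with correlation rho,
    by Simpson's rule over x in [h, 10]; the tail beyond 10 is negligible."""
    s = math.sqrt(1.0 - rho * rho)
    upper = 10.0
    width = (upper - h) / steps

    def integrand(x: float) -> float:
        return STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((k - rho * x) / s))

    total = integrand(h) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(h + i * width)
    return total * width / 3.0


def tetrachoric(n11: int, n10: int, n01: int, n00: int) -> float:
    """Tetrachoric correlation from a 2 x 2 table (n11 = both correct)."""
    n = n11 + n10 + n01 + n00
    # An item is answered correctly when the underlying variable
    # exceeds its threshold, so the threshold is the (1 - p) quantile.
    h = STD_NORMAL.inv_cdf(1.0 - (n11 + n10) / n)
    k = STD_NORMAL.inv_cdf(1.0 - (n11 + n01) / n)
    target = n11 / n
    lo, hi = -0.999, 0.999
    for _ in range(60):  # the probability is increasing in rho
        mid = 0.5 * (lo + hi)
        if prob_both_above(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical table: both items 50 percent correct, one third both correct.
r_tet = tetrachoric(n11=200, n10=100, n01=100, n00=200)
```

For this table the phi coefficient is one third, while the tetrachoric correlation is very nearly 0.5, since with both thresholds at zero the probability of two correct responses is 1/4 + arcsin(rho)/(2 pi).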

It  is  clear  that  conventional  factor  analysis  of  phi  and  tetrachoric 
correlations  is  not  a  satisfactory  means  of  investigating  dimensionality. 
Unfortunately, no uniformly accepted statistical procedures for
dimensionality assessment exist for the case of dichotomous variables. As a
result, a vast literature on the subject has developed, particularly during
the last ten years, as the use of IRT models has increased. Some methods
which  have  gained  attention  recently  are  briefly  described  here;  more 
detailed  reviews  of  dimensionality  assessment  are  given  by  Hattie  (1984, 
1985),  Hulin,  Drasgow,  and  Parsons  (1983,  Chapter  8),  and  Mislevy  (1986c). 




Factor-analytic methods that have been proposed to overcome the
problems described above include factor analysis of item parcels, nonlinear
factor analysis, the generalized least squares methods developed by
Christofferson (1975) and Muthen (1978), and the full-information maximum
likelihood  method  of  Bock  and  his  associates  (Bock  &  Aitkin,  1981;  Bock, 
Gibbons,  &  Muraki,  1985). 

Factor  analysis  of  item  parcels  is  achieved  by  grouping  items  into 
meaningful  subsets  (the  so-called  parcels)  and  then  applying  conventional 
factor-analytic  methods  to  the  parcel  scores.    This  method  was  applied  by 
Cook and Eignor (1984) to a portion of the NAEP data collected in 1979-1980
and by Cook, Eignor, Dorans, and Petersen (1985) to SAT data. One
practical problem with this approach is that it may be difficult to
classify certain items a priori. Furthermore, if the item parcels differ
in average difficulty, the obtained factor structure may be influenced to
an undesirable degree by item difficulty, as in the dichotomous case
(Kingston & Dorans, 1982). A more fundamental drawback is that this
approach does not assess directly the properties of individual items.
Because item scores do not enter the analysis, it is possible for items
that measure a property other than the one of interest to go undetected.
Finally, the application of this approach to the complete NAEP data set is
virtually ruled out because examinees do not all receive the same items
(see Section 10.1.3.1). (The Cook and Eignor [1984] analysis was based on
a  subset  of  examinees  who  had  been  administered  the  same  items.) 

In  a  series  of  publications,  McDonald  presented  a  theory  of  nonlinear 
factor  analysis  (e.g.,  McDonald,  1967,  1983).    In  McDonald's  model, 
$P(x_{ij} = 1 \mid \boldsymbol{\theta})$, the conditional probability that an examinee answers an
item correctly, given his observed vector of latent traits, $\boldsymbol{\theta}$, is
expressed as a nonlinear function of the latent traits. For example, in
one version of the model, $P(x_{ij} = 1 \mid \boldsymbol{\theta})$ is expressed as a weighted sum of
polynomial functions of the latent traits. Simulation studies of the
effectiveness of nonlinear factor analysis as a method of dimensionality
assessment have led to inconsistent findings. Hambleton and Rovinelli
(1986) found that a one-factor polynomial model with linear and quadratic
terms provided a good fit to a simulated unidimensional data set, unlike a
one-factor linear model. Furthermore, a two-factor polynomial model
provided a good fit to two-dimensional simulated data. Based on this and
other findings, Hambleton and Rovinelli concluded that nonlinear factor
analysis is one of the most promising methods for assessing the
dimensionality of dichotomous data. On the other hand, Hattie (1984)
concluded that the sum of absolute residual covariances from nonlinear
factor analysis could not be recommended as an index of dimensionality
because results from the unidimensional and multidimensional data sets were
not  sufficiently  distinct. 

Christofferson (1975) developed a factor-analytic method for
dichotomous data that involves expressing the expected proportion correct
for each item and the joint proportion correct for each pair of items
as a function of item thresholds (see above and Section 10.1.3.4, below)
and factor loadings. The weighted distance between the observed and
modeled values of these proportions is then minimized using generalized
least squares (GLS) methods. Christofferson's solution makes use of the
information contained in the three- and four-way margins of the n-way
contingency table of item responses (see Appendix 2 in Christofferson,
1975; Mislevy, 1986c), unlike conventional factor analysis of phi or
tetrachoric correlations, which makes use of only the one- and two-way
marginals. Solving for estimates of the thresholds and loadings requires
numerical integration and is therefore computationally burdensome. Muthen
(1978) developed an alternative GLS method that reduces the computational
requirements to some degree. However, application of both Christofferson's
and Muthen's methods is currently limited to about 25 items. Bock and
associates developed a factor-analytic approach for dichotomous data,
called full-information factor analysis (Bock, Gibbons, & Muraki, 1985)
because it uses information contained in the joint frequencies of all
orders of the item responses. This method, detailed in Section 10.1.3.4
below, makes use of the marginal maximum likelihood methods of Bock and
Aitkin (1981) for estimating the parameters of the common factor model.

In addition to factor-analytic approaches, a number of other methods of
dimensionality assessment have been proposed. For example, Bejar (1980)
has recommended comparing the estimated item difficulties (i.e., the
estimates of the $b_j$ of Equation 1) obtained by calibrating a complete set
of test items to those obtained by performing the calibration separately
within content areas. (Bejar [1980] also proposed an additional procedure,
which involves computing, for each content area, a scaled score
corresponding to each of the two sets of item parameter estimates, and then
comparing the results obtained by fitting a one-factor model to each of the
two sets of scores.) Although Bejar's (1980) application of the method
appeared to yield useful results, Hambleton and Rovinelli (1986) found that
the method was unable to discriminate between one- and two-dimensional
simulated data sets. Another method that has been proposed is analysis of
the residual differences between the observed proportions of correct
responses for individuals within various categories of proficiency and the
estimated probabilities of correct responses according to the
unidimensional item response model deemed appropriate (e.g., Equation 1).
Various methods of residual analysis have been proposed; reviews are given
by Traub and Wolfe (1981) and Hattie (1985). The rationale is that if the
model fits well, the data can be assumed to be consistent with
unidimensionality. A major drawback is that large residuals may be the
result of model violations other than multidimensionality. Hambleton and
Rovinelli (1986) concluded that indices based on the size of average
residuals obtained after fitting one-, two-, and three-parameter logistic
models were not capable of detecting multidimensionality. It should be
noted that Hambleton and Rovinelli did not report any investigation of the
pattern of residuals.

10.1.3    Methods Used to Assess the Dimensionality of NAEP Reading Data

The proposed methods of dimensionality assessment differ in terms of
the assumptions needed, the hypothesis tested, and the statistical
artifacts that affect interpretation. Rather than selecting a single
method of dimensionality assessment for the NAEP reading data, we applied
four  different  techniques,  described  in  this  section.    For  descriptive 
purposes,  we  included  principal  components  analysis  (PCA)  of  phi  and 
tetrachoric  correlations,  as  described  in  Section  10.1.3.2.    As  an 
experimental  analysis,  we  also  applied  PCA  to  the  image  correlation  matrix, 
a  method  based  on  the  work  of  Guttman  (1953)  and  Kaiser  and  Cerny  (1979), 
described  in  Section  10.1.3.3.  Bock's  full-information  factor  analysis, 
discussed  in  Section  10.1.3.4,  was  applied  to  a  subset  of  the  data. 
Finally,  we  used  the  method  of  Rosenbaum  (1984a,  1984b),  described  in 
Section  10.1.3.5,  which  involves  examination  of  the  partial  association  for 
each  pair  of  items,  conditional  on  the  total  score  on  the  remaining  items. 
Prior  to  a  discussion  of  these  methods,  the  properties  of  the  NAEP  database 
are  described. 


10.1.3.1    Properties  of  NAEP  Data 


10.1.3.1.1    Items  Included  in  Dimensionality  Analyses 

All  reading  items  that  were  included  in  the  IRT  scaling  and  were  also 
spiralled  with  other  items  (see  Section  10.1.3.1.2  and  Chapter  10.2)  were 
used  in  the  dimensionality  analyses.    All  subjects  who  responded  to  one  or 
more  of  these  items  were  included.  The  number  of  subjects  and  items 
available  for  the  analyses  is  shown  in  Table  10.1(1).    (The  NAEP  item 
numbers  for  all  items  included  in  the  dimensionality  analyses  are  given  in 
Appendix 1 of this chapter.) As indicated, there were about 100 items per
grade/age. Twenty-five of the items included in the analyses were
administered to all three grade/ages. The range and mean of the
proportions correct for each of the three grade/ages and for the 25
across-grade/age items are given in Table 10.1(1). As shown, the number of
students per grade/age was roughly 26 to 29 thousand. Given the
number of items and subjects in the database, certain analyses were ruled
out because they were too costly or exceeded computing capabilities. In
other cases, dimensionality analyses were performed on only a subset of
items to minimize the cost and the computational burden.

Ninety-four  percent  of  the  NAEP  reading  items  included  in  the  analyses 
were  multiple  choice  items  with  three  to  six  response  choices.  The 
remainder  were  essay  items  in  which  the  respondent  was  asked  to  react  to  a 
reading  passage.    Essay  items  were  scored  on  a  scale  of  1  to  5,  which  was 
later  dichotomized.  All  items  were  classified  by  reading  experts  on  the 
basis  of  objective  (deriving  information  vs.  integrating  and  applying 
information),  stimulus  (short  or  long  reading  passages,  document,  or 
picture),  and  content  (fictional  story,  poem,  informational  passage,  social 
studies, science, arts and humanities, or life skills). These item
properties, as well as a further classification of the items based on the
work of Mosenthal (1985), were used in attempting to interpret analysis
results. (A subset of reading items designed to assess study skills was
not included in the dimensionality analysis because these items were
not scaled using IRT. That these items differed from the remaining reading
items was suggested by examination of the item content, as well as by
empirical evidence: For a subset of examinees, number-right scores on
blocks of study skills items and on blocks of conventional reading items
were obtained. The attenuation-corrected correlations between study skills
blocks and conventional reading blocks tended to be lower than
intercorrelations between conventional reading blocks. Many of the items
that led to departures from unidimensionality in Jungeblut's [1984]
analyses of the 1979-1980 NAEP data were study skills items [Jungeblut,
personal communication, October 1985].)
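
An attenuation-corrected correlation is commonly obtained by dividing the
observed correlation between two scores by the square root of the product
of their reliabilities. The sketch below assumes that standard formula; the
report does not say which reliability estimates were used, and the numbers
and the function name are hypothetical.

```python
from math import sqrt


def disattenuated_correlation(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both scores.

    r_xy: observed correlation between two block scores.
    rel_x, rel_y: reliability estimates for the two scores (assumed known).
    """
    if rel_x <= 0 or rel_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / sqrt(rel_x * rel_y)


# Hypothetical values, not taken from the NAEP analyses.
corrected = disattenuated_correlation(0.60, 0.70, 0.75)
```

With these invented inputs the corrected value is larger than the observed
.60, as expected when both scores contain measurement error.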


10.1.3.1.2    Missing  Data  Pattern 

A new feature of the Year 15 NAEP design was the use of balanced
incomplete block (BIB) spiralling to assign test items to booklets (see
Messick, Beaton, & Lord, 1983; Beaton, 1984; and Chapter 5). BIB spiralling
combines the features of conventional spiralling and multiple matrix
sampling. As in ordinary multiple matrix sampling, each item is
administered a prescribed number of times, although examinees receive
different subsets of items. BIB spiralling has the additional feature that
each pair of items is assessed a prescribed number of times. In NAEP,
reading items were first grouped into blocks, consisting in most cases of 8
to 12 items, which were then assigned to test booklets according to a
design that provided the desired links between items. This resulted in a
set of approximately 60 different test booklets per grade/age, which were
assigned to respondents in a random sequence.
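
The two balance properties described above (each block given equally often,
and each pair of blocks given together equally often) can be checked
mechanically. The sketch below does so for a small textbook design with 7
blocks and 7 booklets of 3 blocks each. It is a hypothetical toy layout,
not the NAEP booklet design, and the function names are illustrative.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy design (NOT the NAEP layout): booklet b holds blocks
# b, b+1, and b+3 modulo 7, the classic (7, 3, 1) balanced design.
BOOKLETS = [
    (0, 1, 3), (1, 2, 4), (2, 3, 5), (3, 4, 6),
    (4, 5, 0), (5, 6, 1), (6, 0, 2),
]


def block_counts(booklets):
    """Number of booklets in which each block appears."""
    return Counter(block for booklet in booklets for block in booklet)


def pair_counts(booklets):
    """Number of booklets shared by each unordered pair of blocks."""
    return Counter(
        pair
        for booklet in booklets
        for pair in combinations(sorted(booklet), 2)
    )


def is_balanced(booklets):
    """True if all blocks appear equally often and every possible pair
    of blocks appears together equally often."""
    blocks = block_counts(booklets)
    pairs = pair_counts(booklets)
    n_blocks = len(blocks)
    every_pair_present = len(pairs) == n_blocks * (n_blocks - 1) // 2
    return (
        len(set(blocks.values())) == 1
        and every_pair_present
        and len(set(pairs.values())) == 1
    )
```

In this toy design every block appears in three booklets and every pair of
blocks appears together exactly once; dropping any one booklet breaks the
balance.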

A major advantage of BIB spiralling is that it permits the estimation
of inter-item correlations. However, the resulting matrix of correlations,
referred to here as the BIB matrix, has an unusual pattern of missing data.
In the case of the NAEP reading data, the number of respondents available
to estimate correlations between items in the same block is, in most cases,
nine times the number of respondents available for the estimation of
correlations between items that fall within different blocks. Furthermore,
the correlations of items in one block, say, A, with those in another
block, B, are not in general based on the same group of respondents as the
correlations of Block C items with Block D items. Because of the
spiralling procedure used to assign booklets to respondents, the missing
data that result from the implementation of a BIB design can be regarded as
random. However, in using a BIB correlation matrix rather than a
conventional correlation matrix, we are implicitly making the assumption
that the correlations between items are not subject to context effects.
If, for example, the population correlation between two items, i and j,
varied depending on whether a third item, k, were administered with i and
j, then the sample correlation of i and j in the presence of k would not be
an estimate of the same population parameter as the sample correlation of i
and j in the absence of k. Computation of a BIB matrix involves averaging
these sample correlations, which would be undesirable under these
circumstances.
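
Because each pair of items is taken by a different subset of respondents,
each entry of a BIB matrix rests on the respondents who received both
items. A minimal sketch of that pairwise computation for a single phi
coefficient follows. The data, the function name, and the use of None to
mark an item that was not in a respondent's booklet are assumptions made
for illustration, and sampling weights are ignored.

```python
from math import sqrt


def pairwise_phi(x, y):
    """Phi coefficient for two 0/1 items, using only respondents who
    were administered both items.

    x, y: equal-length lists of 0, 1, or None (None = item not given).
    Returns (phi, n), where n is the number of respondents used.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    n10 = sum(1 for a, b in pairs if a == 1 and b == 0)
    n01 = sum(1 for a, b in pairs if a == 0 and b == 1)
    n00 = n - n11 - n10 - n01
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return None, n  # undefined when either item has no variance
    return (n11 * n00 - n10 * n01) / denom, n


# Invented responses for eight respondents.
item_i = [1, 1, 0, 0, None, 1, 0, None]
item_j = [1, 0, 0, None, 1, 1, 0, 0]
phi, n_used = pairwise_phi(item_i, item_j)
```

Only five of the eight invented respondents took both items, so the
coefficient is based on n = 5; a different pair of items would generally
rest on a different subset.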

Table 10.1(1)

Number of Items and Students Available for Dimensionality Analyses

                      Number       Proportions Correct       Number of
Grade/Age            of Items   Minimum  Maximum   Mean       Students

4/9                     108       .04      .93      .50        26,087
8/13                    100       .09      .98      .63        28,405
11/17                    95       .21      .96      .70        28,861
Across Grade/Ages
(Common Items)           25       .13      .90      .53        83,353


Even if the assumption of no context effects is justified, there are
other ways in which the properties of the BIB matrix differ from those of a
conventional correlation matrix. For example, the standard errors of the
within-block correlations are smaller than those of the between-block
correlations. Also, the BIB matrix may have negative eigenvalues, unlike a
conventional correlation matrix. As detailed in Section 10.1.3.2 and
Tables 10.1(2) and (3), both phi and tetrachoric matrices of NAEP items had
negative roots in most cases. For analyses that required a matrix that was
at least positive semi-definite, an adjustment procedure, described in
Appendix 2 of this chapter, was applied. Although there is no indication
that analysis results were affected in any major way by the use of BIB
matrices or their adjusted counterparts, the statistical properties of
these matrices are not fully understood at present. The special properties
of BIB matrices and the impact of BIB spiralling on the NAEP dimensionality
analyses are discussed in further detail in Section 10.1.4.
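
To see how a matrix assembled from pairwise correlations can have a
negative eigenvalue, consider a small hypothetical example. The three
correlations below are invented (they are not NAEP values) and are imagined
as coming from three different subsamples. For a symmetric 3 x 3 matrix the
determinant is the product of the eigenvalues and the trace is their sum,
so a negative determinant together with a positive trace implies exactly
one negative root.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


# Hypothetical pairwise correlations: items 1-2 and 1-3 correlate
# strongly and positively, yet items 2-3 correlate strongly and
# negatively. No single complete data set could produce all three.
R = [
    [1.0, 0.9, 0.9],
    [0.9, 1.0, -0.9],
    [0.9, -0.9, 1.0],
]

trace = sum(R[k][k] for k in range(3))
determinant = det3(R)

# Odd number of negative roots (det < 0), but not all three (trace > 0).
has_negative_root = determinant < 0 and trace > 0
```

The example is deliberately extreme; with real data the inconsistencies
among subsamples are small, which is consistent with the modest negative
roots reported for the phi matrices.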

In  addition  to  the  BIB  missing  data,  which  can  be  regarded  as  random, 
there  are  two  major  categories  of  non-random  missing  data:    omitted  items 
and  items  that  the  respondent  was  administered  but  did  not  reach. 
Unanswered  items  occurring  after  the  last  valid  response  within  a  block 
were  considered  "not  reached."    (In  administering  the  items,  each  block  was 
timed  separately.)    Unanswered  items  that  occurred  prior  to  the  last  valid 
response  (and  were  not  a  result  of  the  BIB  design)  were  coded  as  omits. 
The category of omitted items was defined to include as well any items
marked "I don't know," which was a response alternative for all multiple
choice items. The treatment of not reached and omitted items in each of
the  dimensionality  analyses  is  discussed  in  Sections  10.1.3.2  to  10.1.3.5. 
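
The coding rules above can be sketched as a small function applied to one
timed block at a time. The response codes, the function name, and the
decision to count an "I don't know" mark as a response when locating the
last response in a block are assumptions made for illustration; the report
does not spell out that last detail.

```python
IDK = "IDK"  # hypothetical code for the "I don't know" alternative


def code_block(responses):
    """Classify each item in one separately timed block.

    responses: list in administration order. None means no answer,
    IDK means "I don't know" was marked, anything else is an answer.
    Returns a list of "answered", "omitted", or "not_reached".
    """
    marked = [k for k, r in enumerate(responses) if r is not None]
    last = marked[-1] if marked else -1  # position of the last response
    codes = []
    for k, r in enumerate(responses):
        if r is None:
            # Blank before the last response: omit. Blank after: not reached.
            codes.append("omitted" if k < last else "not_reached")
        elif r == IDK:
            codes.append("omitted")
        else:
            codes.append("answered")
    return codes


example = code_block(["A", None, IDK, "C", None, None])
```

For the invented block above, the blank second item and the "I don't know"
third item are coded as omits, and the two trailing blanks are coded as not
reached.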


10.1.3.2    Principal  Component  Analysis  of  Inter-item  Correlation  Matrices 

Despite  the  drawbacks  described  in  Section  10.1.2,  principal  component 
analyses  (PCA)  of  the  phi  and  tetrachoric  matrices  for  each  grade/age  were 
conducted  for  descriptive  purposes.    In  addition,  analyses  including  all 
respondents  were  performed,  based  on  the  25  items  common  to  all  three 
grade/ages.    It  can  be  argued  that  the  results  of  these  analyses  represent 
a  "worst  case";  that  is,  because  the  analyses  tend  to  produce  spurious 
factors,  results  that  were  free  of  artifacts  would  be  expected  to  be  more 
consistent  with  unidimensionality. 

Items that were not reached were excluded from the analysis; omitted
items were scored as incorrect. For each of the four phi matrices, Table
10.1(2) gives the range of inter-item correlations, the median correlation,
the first five eigenvalues and the percent of the trace they represent,
and, as an index of the degree to which the matrix departed from
positive-definiteness, the sum of the negative eigenvalues as a percent of
the trace of the matrix. The median sample size (N) on which the
correlation coefficients were based (see Section 10.1.3.1.2) is also given.
The corresponding information for the tetrachoric matrices is given in
Table 10.1(3). The results in Tables 10.1(2) and (3) are based on analyses
that incorporated the respondents' sampling weights (see Chapter 13.1).
Unweighted analyses yielded almost identical results.

Table 10.1(2)

Eigenvalues and Descriptive Statistics for Phi Matrices

Grade 4/Age 9 (108 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
      23.9             22         Median N                          280
       3.3              3         Range of r                  -.18, .53
       2.5              2         Median r                          .19
       2.4              2         Neg. roots as pct. of trace         3
       2.2              2

Grade 8/Age 13 (100 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
      17.0             17         Median N                          323
       2.6              3         Range of r                  -.15, .60
       2.5              2         Median r                          .14
       2.2              2         Neg. roots as pct. of trace         2
       2.1              2

Grade 11/Age 17 (95 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
      17.5             18         Median N                          331
       3.1              3         Range of r                  -.16, .68
       2.3              2         Median r                          .16
       2.1              2         Neg. roots as pct. of trace         2
       2.0              2

All Grade/Ages Combined (25 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
       6.3             25         Median N                          919
       1.5              6         Range of r                   .29, .57
       1.2              5         Median r                          .18
       1.1              5         Neg. roots as pct. of trace         0
       1.0              4


Table 10.1(3)

Eigenvalues and Descriptive Statistics for Tetrachoric Matrices

Grade 4/Age 9 (108 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
   [illegible]         37         Median N                          280
   [illegible]          6         Range of r          [illegible], .81
   [illegible]     [illegible]    Median r                          .35
       3.7              3         Neg. roots as pct. of trace        27
   [illegible]          3

Grade 8/Age 13 (100 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
      30.0             30         Median N                          323
       4.3              4         Range of r                  -.34, .81
   [illegible]          4         Median r                          .27
   [illegible]          3         Neg. roots as pct. of trace        21
       3.3              3

Grade 11/Age 17 (95 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
      32.0             34         Median N                          331
       3.9              4         Range of r                  -.38, .90
       3.3              3         Median r                          .31
       3.0              3         Neg. roots as pct. of trace        19
   [illegible]          3

All Grade/Ages Combined (25 items)

  First 5 Roots   Pct. of Trace   Descriptive Statistics
      10.0             40         Median N                          919
       1.6              6         Range of r                   .05, .80
   [illegible]          5         Median r                          .33
       1.2         [illegible]    Neg. roots as pct. of trace         0
   [illegible]     [illegible]


It is clear that, for each of the eight matrices, there is a large
first root, constituting between 17 and 25 percent of the trace for the phi
matrices and between 30 and 40 percent for the tetrachoric matrices (but
note that the negative roots constitute up to 27 percent of the trace for
tetrachoric matrices). The second root is always less than one-fourth of
the first. Following the sharp drop-off between the first and the second,
the remaining roots trail off gradually. These findings are reassuring in
that they are consistent with a large first dimension. (The size of the
first component may appear small to those who are unaccustomed to examining
the results of item-level factor analyses. In interpreting these findings,
however, it is important to consider that the median inter-item
correlations are low: between .14 and .19 for the four phi matrices and
between .27 and .35 for the tetrachoric matrices. Results of PCA of phi
matrices computed from simulated unidimensional data showed that the first
root typically constituted 25 to 30 percent of the trace; see Section
10.1.3.3 and Table 10.1(5).) The loadings on the first principal component
were not related in any obvious way to the item classifications discussed
in Section 10.1.3.1.1.
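
The "percent of trace" figures can be reproduced from the roots themselves.
For a correlation matrix with ones on the diagonal, the trace equals the
number of items, so each root is divided by the item count. The sketch
below uses the Grade 4/Age 9 phi roots from Table 10.1(2); the function
name is illustrative and not taken from the report.

```python
def percent_of_trace(roots, n_items):
    """Each root as a rounded percentage of the trace.

    Assumes a unit diagonal, so the trace equals the number of items.
    """
    return [round(100 * root / n_items) for root in roots]


# First five roots of the Grade 4/Age 9 phi matrix (108 items).
roots_grade4_phi = [23.9, 3.3, 2.5, 2.4, 2.2]

pct = percent_of_trace(roots_grade4_phi, 108)

# Ratio of the second root to the first, for the one-fourth comparison.
second_to_first = roots_grade4_phi[1] / roots_grade4_phi[0]
```

For this matrix the second root is well under one-fourth of the first,
which is the pattern described in the paragraph above.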


10.1.3.2.1    Application of Guessing Corrections to Tetrachoric
Correlations

When it is possible for items to be answered correctly through
guessing, the magnitude of observed tetrachoric correlations is related to
item difficulty (e.g., see Hulin, Drasgow, & Parsons, 1983, pp. 249-255).
To eliminate this problem, Carroll (1945) suggested that the frequencies in
the 2 x 2 tables of responses for each pair of items be adjusted to
"remove" the effects of guessing and that tetrachorics be computed on the
basis of these adjusted frequencies. In Carroll's model, it is implicitly
assumed that, for each pair of items, the probability of getting one item
right by guessing is independent of the probability of making a correct
guess on the other item. In applying the model, it is typically assumed
that guessing is random and that the probability of getting an item right
by guessing is therefore equal to the reciprocal of the number of response
choices. To determine whether it would be a useful strategy for NAEP data,
Carroll's correction was applied to the item responses for Grade 8/Age 13,
setting $g_j$, the hypothetical probability of guessing right on item j,
equal to the reciprocal of the number of response choices for item j,
excluding the "I don't know" alternative. For essay items, $g_j$ was set to
0. The results were clearly unsatisfactory: It was found that 16 percent of
the tetrachoric coefficients were rendered incomputable because of negative
adjusted cell frequencies. Several other corrections were investigated,
but deemed unsatisfactory, including a modification of Carroll's correction
in which the input $g_j$ values were adjusted so as to avoid the occurrence
of negative adjusted cell frequencies and a correction in which each $g_j$
was set equal to the estimated lower asymptote, $c_j$ (see Equation 1), of
the item from the IRT item calibration. (Note that Bock, Gibbons, & Muraki
[1985] describe a modification of Carroll's correction that apparently
produces satisfactory results. This modified correction did not come to our
attention until after our analyses were complete.)
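
The report does not reproduce the correction formulas. The sketch below
shows one way to invert a random-guessing model under the independence
assumption stated above: an examinee who does not know an item still gets
it right with probability g, independently across the two items. It is
offered as an illustration of how negative adjusted cells arise, not as the
exact computation used in the analyses; the function names and all numbers
are hypothetical.

```python
def carroll_adjust(p11, p10, p01, p00, g_i, g_j):
    """Remove random guessing from a 2 x 2 table of proportions.

    p11: both items right; p10: item i right, item j wrong;
    p01: item i wrong, item j right; p00: both wrong.
    g_i, g_j: assumed chances of a correct guess on each item.
    Returns adjusted (q11, q10, q01, q00). A negative entry means a
    tetrachoric cannot be computed from the adjusted table.
    """
    q00 = p00 / ((1 - g_i) * (1 - g_j))
    q10 = p10 / (1 - g_j) - g_i * q00
    q01 = p01 / (1 - g_i) - g_j * q00
    q11 = 1 - q00 - q10 - q01
    return q11, q10, q01, q00


def add_guessing(q11, q10, q01, q00, g_i, g_j):
    """Forward model: observed proportions implied by guessing-free ones."""
    p00 = q00 * (1 - g_i) * (1 - g_j)
    p10 = (q10 + g_i * q00) * (1 - g_j)
    p01 = (q01 + g_j * q00) * (1 - g_i)
    p11 = 1 - p00 - p10 - p01
    return p11, p10, p01, p00


# An invented observed table for two four-choice items (g = 1/4 each).
# Too few "i right, j wrong" responses are observed for the guessing
# model to be consistent, so the adjusted q10 cell goes negative.
adjusted = carroll_adjust(0.40, 0.02, 0.28, 0.30, 0.25, 0.25)
```

When the observed table really was generated by the guessing model, the
adjustment recovers the guessing-free proportions exactly; negative cells
appear when the data are not consistent with the assumed g values.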


10.1.3.3    Principal Components Analysis of the Image Correlation Matrix

Guttman (1953) developed a theory for the structure of quantitative
variates called image theory. Image theory is based on the partitioning of
a variable into two additive segments: the part that can be predicted
through least squares linear regression of that variable on all the
remaining variables, called the image, and the error of prediction, called
the anti-image. Thus, unlike common factor theory, image theory provides
an explicit definition for the common part of a variable. Another
difference from the traditional factor-analytic approach is that, in
general, the anti-images have non-zero covariances. Guttman shows that
common factor theory may be viewed as a special case of image theory. The
relation between image theory and other factor-analytic approaches is
further examined by Harris (1962) and reviewed by Mulaik (1972).

Suppose that n variables are to be observed. The decomposition of the
original variates, z, into images, v, and anti-images, u, can be expressed
as

$$ z = v + u \tag{3} $$

The images are obtained as $v = Wz$, where the weight matrix W is defined
as

$$ W = I - S^{2} R^{-1} \tag{4} $$

where R is the correlation matrix of the original variates, z, and

$$ S^{2} = [\operatorname{diag}(R^{-1})]^{-1} \tag{5} $$

The off-diagonals of W contain the regression weights for predicting each
of the variates z from the remaining n - 1 variates. The diagonals of W
are equal to zero because the regression of a variate on itself is not of
interest.

The principles of image theory are usually applied in practice by
factor-analyzing G, the covariance matrix of the images, given by


$$
\begin{aligned}
G &= E(vv') = E[(Wz)(Wz)'] \\
  &= E(Wzz'W') = W\,E(zz')\,W' \\
  &= WRW' = (I - S^{2}R^{-1})\,R\,(I - R^{-1}S^{2}) \\
  &= R + S^{2}R^{-1}S^{2} - 2S^{2}
\end{aligned}
\tag{6}
$$

The jth diagonal element of this matrix is the variance of the jth image,
which is equal to the squared multiple correlation coefficient (SMC)
obtained by regressing the jth variate on the remaining n - 1 variates. In
this sense, G resembles the "reduced correlation matrix" of common factor
analysis with SMCs used as communality estimates. The off-diagonals of G,
however, tend to be slightly smaller than those of the reduced correlation
matrix (Kaiser, 1963); furthermore, G is always Gramian (assuming data are
complete), unlike a correlation matrix with SMCs inserted in the diagonal.
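
The properties just described can be checked numerically on a small
example. The sketch below builds W and G for a hypothetical 3 x 3
correlation matrix using only the Python standard library (the adjugate
inverse is adequate for a toy case) and follows the definitions of W, the
diagonal matrix of anti-image variances, and G given in this section. The
matrix values and function names are invented for illustration.

```python
def inverse3(m):
    """Inverse of a 3 x 3 matrix via the adjugate (toy-sized only)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[value / det for value in row] for row in adj]


def matmul(x, y):
    """Product of two matrices stored as nested lists."""
    return [
        [sum(x[r][k] * y[k][c] for k in range(len(y))) for c in range(len(y[0]))]
        for r in range(len(x))
    ]


def image_covariance_3x3(R):
    """Return (W, s2, G) for a 3 x 3 correlation matrix R.

    s2 holds the diagonal of S^2, i.e. 1 / diag(R^-1).
    W = I - S^2 R^-1 and G = W R W'.
    """
    n = len(R)
    R_inv = inverse3(R)
    s2 = [1.0 / R_inv[j][j] for j in range(n)]
    # Premultiplying R^-1 by the diagonal S^2 scales row r by s2[r].
    W = [
        [(1.0 if r == c else 0.0) - s2[r] * R_inv[r][c] for c in range(n)]
        for r in range(n)
    ]
    W_t = [list(col) for col in zip(*W)]
    G = matmul(matmul(W, R), W_t)
    return W, s2, G


# Hypothetical correlation matrix (positive definite, complete data).
R = [
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
]
W, s2, G = image_covariance_3x3(R)
```

In this example the diagonal of W is zero, and each diagonal element of G
equals one minus the corresponding anti-image variance, that is, the SMC of
that variate on the other two.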

As an alternative to the analysis of the G matrix, Kaiser and
Cerny (1979) recommended principal component analysis of the image
correlation matrix, $G^{*}$, given by

$$ G^{*} = D^{-1/2}\, G\, D^{-1/2} \tag{7} $$

where

$$ D = \operatorname{diag}(G) = I - S^{2} \tag{8} $$
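
The equality between the diagonal of G and $I - S^{2}$ is stated without
derivation. It follows from the definitions of $S^{2}$ and G given earlier
in this section, as the short sketch below indicates. Here $r^{jj}$ denotes
the jth diagonal element of $R^{-1}$ and $s_j^{2}$ the jth diagonal element
of $S^{2}$; this notation is introduced only for the illustration.

```latex
\begin{aligned}
\operatorname{diag}(G)
  &= \operatorname{diag}(R) + \operatorname{diag}(S^{2}R^{-1}S^{2}) - 2S^{2} \\
  &= I + S^{2}\,\operatorname{diag}(R^{-1})\,S^{2} - 2S^{2} \\
  &= I + S^{2} - 2S^{2} \;=\; I - S^{2},
\end{aligned}
\qquad\text{so that}\qquad
G_{jj} = 1 - s_j^{2} = 1 - \frac{1}{r^{jj}} .
```

The last expression is the usual formula for the squared multiple
correlation of variate j with the remaining n - 1 variates.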


Kaiser  (1970;  see  also  Kaiser  &  Cerny,  1979)  conjectured  that  image 
analysis  would  be  well-suited  to  the  factor  analysis  of  dichotomous  data. 
He  noted  that  because  the  images  are  least  squares  predicted  values  of  one 
variate  based  on  the  remaining  n  -  1  variates,  "a  crude  appeal  to  the 
Central  Limit  Theorem  suggests  that  the  images  will  be  sensibly 
multivariate  normal,  a  set-up  which  is  well  known  not  to  produce  difficulty 
factors"  (Kaiser,  1970,  p.  407). 

As an experimental approach to dimensionality assessment, principal
component analysis of the image correlation matrix was applied to the NAEP
data for Grade 4/Age 9, Grade 8/Age 13, and Grade 11/Age 17 and to the 25
across-grade/age items. Modification of the standard equations of image
analysis was required because, in the case of NAEP data, the matrix R of
weighted phi correlations is not positive definite (see Section 10.1.3.2
and Table 10.1(2)) and therefore cannot be inverted. An adjustment
procedure, detailed in Appendix 2 of this chapter, was used to obtain a
singular approximation to the matrix of inter-item correlations and a
pseudo-inverse of this adjusted matrix. The pseudo-inverse matrix was
then substituted for $R^{-1}$ in the formulas for W and $S^{2}$ (Equations
4 and 5), as recommended by Kaiser and Cerny (1978). Analogues of the
matrices G, $G^{*}$, and D (Equations 6, 7, and 8) were computed using
these modified forms of W and $S^{2}$.


Results of the image analysis were superficially appealing. As shown
in Table 10.1(4), the first roots of the image correlation matrix were
often considerably larger than those of the phi matrix. For example, they
were almost three times as large in the across-grade analysis. However, as
described below, both empirical and theoretical examinations of this method
show that it cannot provide the correct answer about dimensionality in the
dichotomous case.

To  investigate  the  properties  of  the  image  analysis  solution,  PCA  of 
the  image  correlation  matrix  was  applied  to  several  simulated  data  sets 
generated  from  a  unidimensional  model.    The  simulation  studies  were 
conducted  as  follows: 

(1)  Assuming a three-parameter logistic model, NAEP reading items
     were calibrated with the LOGIST program (M. S. Wingersky,
     1983) using actual NAEP data. Thirty of these items were
     randomly selected for this simulation run.

(2)  One thousand pseudo-random values from a normal distribution
     with mean zero and unit variance were then generated. These
     represent theta or proficiency values for N = 1000 examinees.

(3)  The three-parameter logistic function (Equation 1) was used to
     obtain the n x N = 30 x 1000 values of $P_{ij}$, the probability
     that person i gets item j correct. The item parameters $a_j$,
     $b_j$, and $c_j$ were obtained from step 1 and the $\theta_i$
     values from step 2.

(4)  Corresponding to each value of $P_{ij}$, a pseudo-random value
     $U_{ij}$ was generated from a uniform distribution on the interval
     [0,1]. If $U_{ij}$ was less than $P_{ij}$, item j was scored as
     correct for person i; otherwise it was scored as incorrect. The
     correlation matrix of these simulated data was then obtained
     and the image procedure applied.
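
Steps 2 through 4 can be sketched in a few lines. Equation 1 is not
reproduced in this section, so the sketch assumes the usual three-parameter
logistic form with a scaling constant of 1.7. The item parameters below are
random stand-ins for the calibrated LOGIST estimates of step 1, and the
function names and seed are arbitrary.

```python
import math
import random


def p_correct(theta, a, b, c, scale=1.7):
    """Assumed 3PL form: c + (1 - c) / (1 + exp(-scale * a * (theta - b)))."""
    return c + (1 - c) / (1 + math.exp(-scale * a * (theta - b)))


def simulate_responses(items, n_examinees, rng):
    """Simulate 0/1 scores under a unidimensional 3PL model.

    items: list of (a, b, c) tuples. rng: a random.Random instance.
    Returns a list of n_examinees rows, each with one score per item.
    """
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)        # step 2: proficiency ~ N(0, 1)
        row = []
        for a, b, c in items:
            p = p_correct(theta, a, b, c)  # step 3: probability correct
            u = rng.random()               # step 4: uniform on [0, 1)
            row.append(1 if u < p else 0)  # correct when u falls below p
        data.append(row)
    return data


rng = random.Random(0)
# Hypothetical parameters standing in for 30 calibrated NAEP items.
items = [(rng.uniform(0.5, 1.5), rng.uniform(-2.0, 2.0), 0.2) for _ in range(30)]
data = simulate_responses(items, 1000, rng)
```

The sketch draws each examinee's proficiency and responses in a single
pass rather than generating all proficiency values first; the resulting
data have the same distribution. Computing the correlation matrix and
applying the image procedure would follow as separate steps.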

Table  10.1(5)  shows  the  first  five  roots  of  the  phi  and  image 
correlation  matrices  for  one  of  the  simulated  data  sets.    Whereas  the  first 
root  of  the  phi  matrix  was  only  about  one  quarter  of  the  trace  in  the 
simulation,  the  first  root  of  the  image  correlation  matrix  was  about  80 
percent  of  the  trace.    Other  simulated  unidimensional  data  sets  produced 
similar  values.    If  the  size  of  the  first  root  is  used  as  a  criterion,  the 
image  analysis  technique  appears  to  be  superior  to  PCA  of  the  phi  matrix  in 
revealing  the  true  unidimensional  structure  underlying  the  data.  However, 
as  in  the  case  of  the  phi  matrix,  the  loadings  of  items  on  the  second 
principal component of the image correlation matrix have substantial
correlations  with  the  proportions  correct  for  the  items:    the  correlations 
were  .85  for  the  phi  matrix  and  .65  for  the  image  correlation  matrix.  This 
makes  it  clear  that  the  image  approach  does  not  eliminate  the  problem  of 
difficulty  factors. 


Table 10.1(4)

Eigenvalues of the Image Correlation Matrix

Grade 4/Age 9 (108 items)

  First 5 Roots   Pct. of Trace
      27.3             25
       9.5              9
       3.7              3
       3.2              3
       2.7              3

Grade 8/Age 13 (100 items)

  First 5 Roots   Pct. of Trace
      23.2             23
       9.5              9
       3.9              4
       2.8              3
       2.6              3

Grade 11/Age 17 (95 items)

  First 5 Roots   Pct. of Trace
      25.8             27
       5.7              6
       4.3              4
       3.4              4
       3.3         [illegible]

All Grade/Ages Combined (25 items)

  First 5 Roots   Pct. of Trace
      18.0             72
       2.0              8
       1.1              5
       0.7              3
       0.6              2


Table 10.1(5)

First Five Eigenvalues of Correlation and Image
Correlation Matrices for Simulation Data
(30 Items with NAEP Item Parameters)

        Phi Matrix               Image Correlation Matrix
  First 5      Pct. of             First 5      Pct. of
   Roots        Trace               Roots        Trace

    7.7          26                  23.8          79
    1.7           6                   2.6           9
    1.1           4                   0.5           2
    1.0           3                   0.5           2
    1.0           3                   0.4           1

Correlation of Loadings on Second Principal
Component with Proportions Correct

    .85                               .65


Upon consideration, it seems unrealistic to expect the image approach
to produce an accurate reflection of the number of dimensions, when it is
known that factoring the phi matrix does not. After all, the image
covariance matrix G can be expressed as the sum of three terms, each of
which is a function of the phi matrix. In addition, application of the
image approach to dichotomous data involves the assumption of a linear
regression model which is known to be violated. McDonald and Ahlawat
(1974) expressed doubts about the use of image analysis in the dichotomous
case, noting the relations between the eigenvalues of G and those of
$R - S^{2}$, the reduced correlation matrix with SMCs as communality
estimates (see Harris, 1962).

Because  it  was  evident  from  both  a  theoretical  and  an  empirical 
perspective  that  the  image  approach  produces  misleading  results  in  the 
dichotomous  case,  attempts  to  interpret  the  findings  were  discontinued. 


10.1.3.4    Full-information  Factor  Analysis 

A factor-analytic method that was designed for dichotomous data is
full-information factor analysis (Bock, Gibbons, & Muraki, 1985; see also
Mislevy, 1986c), which is implemented in the TESTFACT program (Wilson,
Wood, & Gibbons, 1983). Unlike the methods described in Sections 10.1.3.2
and 10.1.3.3, this method does not require the computation of correlation
coefficients, but operates instead on the set of distinct item response
vectors. In contrast to factor analysis of correlation coefficients, which
makes use of only the pairwise joint frequencies of item responses, Bock's
full-information solution uses information contained in the joint
frequencies of all orders. In applying this method, a particular model for
the item responses must be assumed. In the case of the NAEP data, the
selected model was a multivariate generalization of the three-parameter
normal ogive in which each item is allowed to load on multiple factors.
The model can be developed by first assuming that underlying the response
of person i to item j is a response process variable defined as

$$ y_{ij} = \sum_{k=1}^{K} \lambda_{jk}\,\theta_{ik} + \nu_j \qquad [9] $$

where $\theta_{ik}$ represents the value of the kth latent variable (factor),
k = 1, 2, ..., K, for the ith individual, i = 1, 2, ..., N; $\lambda_{jk}$ is
the loading of the jth item, j = 1, 2, ..., n, on the kth latent variable;
and $\nu_j$ is a residual term associated with item j. The response process
variables are assumed to have mean zero and variance one. The observed score
of the ith examinee on the jth item, $x_{ij}$, takes on a value of 1,
indicating a correct score, if $y_{ij}$ exceeds $\gamma_j$, the threshold for
the jth item. Otherwise, $x_{ij} = 0$. If it is assumed that the residuals
$\nu_j$ are independently distributed as $N(0, \sigma_j^2)$, the conditional
probability that the ith examinee gets the jth item correct, given that his
values on the latent variables are equal to $\theta_i$, can be expressed as

$$ P(x_{ij} = 1 \mid \theta_i) = \frac{1}{\sqrt{2\pi}\,\sigma_j} \int_{\gamma_j}^{\infty} \exp\left[ -\frac{1}{2} \left( \frac{y - \sum_{k=1}^{K} \lambda_{jk}\theta_{ik}}{\sigma_j} \right)^{2} \right] dy = F_j(\theta_i) \qquad [10] $$

This is a multivariate generalization of the two-parameter normal ogive
model (see Lord & Novick, 1968).

This model can be modified to allow for the possibility of guessing by
substituting

$$ F_j^{*}(\theta_i) = c_j + (1 - c_j)\,F_j(\theta_i) \qquad [11] $$

for $F_j(\theta_i)$, where $c_j$ represents the probability that an individual
with very low ability gets the item correct. This multivariate generalization
of the three-parameter normal ogive model was applied in the NAEP analyses.
The $c_j$ values are treated as fixed constants in the full-information factor
analysis. The $c_j$ parameters were estimated a priori using BILOG (Mislevy &
Bock, 1982) and then input to the TESTFACT program. NAEP items that were
coded as "not reached" (see Section 10.1.3.1.2) were not included in the
analysis. Omitted items, on the other hand, were scored correct with
probability $c_j$. Under this strategy, examinees who omit an item have the
same theoretical probability of getting the item correct as examinees who
guess in the absence of any information.
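
The item response function in Equations 10 and 11 can be evaluated directly, because the integral in Equation 10 is a normal upper-tail probability. The sketch below is illustrative only. It assumes uncorrelated, unit-variance latent variables, so that the unit variance of the response process implies a residual variance of one minus the sum of squared loadings. The function names and the example loadings are hypothetical and are not taken from TESTFACT.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_correct(theta, loadings, threshold, guessing=0.0):
    """P(x = 1 | theta) for one item under Equations 10 and 11.

    theta     -- latent variable values for one examinee
    loadings  -- the item's loadings, one per latent variable
    threshold -- the item's threshold (gamma)
    guessing  -- the item's lower asymptote (c); 0 gives Equation 10
    """
    # Assumption: Var(y) = 1 and factors ~ N(0, I), so the residual
    # variance is 1 minus the communality.
    communality = sum(lam * lam for lam in loadings)
    sigma = math.sqrt(1.0 - communality)
    mean = sum(lam * t for lam, t in zip(loadings, theta))
    f = normal_cdf((mean - threshold) / sigma)   # Equation 10
    return guessing + (1.0 - guessing) * f       # Equation 11


# Hypothetical two-factor item with guessing parameter .2.
p_average = prob_correct([0.0, 0.0], [0.6, 0.3], 0.0, guessing=0.2)
p_very_low = prob_correct([-10.0, -10.0], [0.6, 0.3], 0.0, guessing=0.2)
print(p_average, p_very_low)
```

For an examinee with very low values on both latent variables, the probability falls to the guessing parameter, which is the interpretation of $c_j$ given in the text.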

Incorporating the item response function, $F_j^{*}(\theta)$, defined in
Equation 11, the marginal probability of the sth response pattern can be
expressed as

$$ P_s = P(\mathbf{x} = \mathbf{x}_s) = \int \cdots \int \prod_{j=1}^{n} \left[ F_j^{*}(\theta) \right]^{x_{sj}} \left[ 1 - F_j^{*}(\theta) \right]^{1 - x_{sj}} f(\theta)\, d\theta \qquad [12] $$

where $x_{sj}$ is the response to the jth item in the sth response pattern,
s = 1, 2, ..., S, and $S \le \min(2^n, N)$ is the number of response patterns.
It is further assumed in this application that $f(\theta)$ is the multivariate
normal distribution with mean 0 and covariance matrix I. Now, if it is
assumed that the counts of the distinct response patterns follow a
multinomial distribution, the likelihood of the matrix X of observed counts
of distinct response patterns can be expressed as:

$$ P(X) = \frac{N!}{r_1!\, r_2! \cdots r_S!}\; P_1^{r_1} P_2^{r_2} \cdots P_S^{r_S} \qquad [13] $$

where $r_s$ is the observed count of the sth response pattern and $P_s$ is
given by Equation 12.




The quantities $P_s$ are approximated using numerical integration. The
marginal  maximum  likelihood  method  of  Bock  and  Aitkin  (1981),  which  is 
based  on  earlier  work  by  Bock  and  Lieberman  (1970),  is  then  applied  to 
Equation  13  to  obtain  estimates  of  the  factor  loadings  and  thresholds  for 
each  item  (see  Bock,  Gibbons,  &  Muraki,  1985;  Mislevy,  1986c). 
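
To make Equations 12 and 13 concrete, the following sketch approximates the marginal pattern probabilities for a one-factor model by summing over a fixed grid of points, and then evaluates the multinomial log-likelihood of a set of pattern counts. It is a minimal illustration under stated assumptions: the equally spaced grid, the item parameters, and the counts are all hypothetical, and the quadrature scheme actually used by TESTFACT is not described in this section.

```python
import math
from itertools import product


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def item_prob(theta, loading, threshold, guessing):
    """One-factor version of Equation 11 (residual variance = 1 - loading^2)."""
    sigma = math.sqrt(1.0 - loading ** 2)
    return guessing + (1.0 - guessing) * normal_cdf(
        (loading * theta - threshold) / sigma)


def quadrature(n_points=41, lo=-5.0, hi=5.0):
    """Equally spaced nodes with standard-normal weights scaled to sum to 1."""
    step = (hi - lo) / (n_points - 1)
    nodes = [lo + i * step for i in range(n_points)]
    density = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(density)
    return nodes, [d / total for d in density]


def pattern_prob(pattern, items, nodes, weights):
    """Approximate P_s (Equation 12) for one response pattern."""
    prob = 0.0
    for theta, weight in zip(nodes, weights):
        likelihood = 1.0
        for x, (loading, threshold, guessing) in zip(pattern, items):
            f = item_prob(theta, loading, threshold, guessing)
            likelihood *= f if x == 1 else 1.0 - f
        prob += weight * likelihood
    return prob


def log_likelihood(counts, items, nodes, weights):
    """Log of Equation 13 without the multinomial coefficient."""
    return sum(r * math.log(pattern_prob(s, items, nodes, weights))
               for s, r in counts.items())


# Hypothetical (loading, threshold, guessing) triples for three items.
items = [(0.7, -0.5, 0.20), (0.6, 0.0, 0.20), (0.5, 0.5, 0.25)]
nodes, weights = quadrature()
probs = {s: pattern_prob(s, items, nodes, weights)
         for s in product((0, 1), repeat=len(items))}
total_prob = sum(probs.values())

counts = {(1, 1, 1): 40, (1, 1, 0): 25, (0, 0, 0): 10}  # hypothetical counts
log_lik = log_likelihood(counts, items, nodes, weights)
print(total_prob, log_lik)
```

Marginal maximum likelihood estimation would search for the loadings and thresholds that maximize this log-likelihood; the sketch only evaluates it at fixed parameter values.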

If  sample  size  is  sufficiently  large,  a  test  of  the  fit  of  the  K-factor 
model  relative  to  a  general  multinomial  alternative  can  be  obtained  using  a 
chi-square  approximation  to  the  likelihood  ratio  test.    The  model  can  be 
re-estimated  and  the  test  repeated  for  successive  values  of  K.  The 
difference  between  these  chi-square  statistics  is  also  distributed  as 
chi-square  (under  the  hypothesis  that  the  more  restrictive  model  is 
correct)  and  can  be  used  to  test  the  improvement  in  model  fit  that  is 
achieved  by  allowing  the  number  of  latent  variables  to  increase.    The  test 
of  change  in  model  fit  has  been  shown  to  perform  well  even  when  the 
frequency  table  is  sparse  (Haberman,  1977). 

Because  the  TESTFACT  program  is  very  expensive  to  run,  full-information 
factor  analysis  was  applied  only  to  42  items  for  Grade  8/Age  13.  These 
items,  which  were  chosen  to  maximize  the  chances  of  detecting 
multidimensionality,  were  intended  to  represent  four  distinct  item  types: 
general reading comprehension, inference of word meaning from context, life
skills, and essay. The comprehension, word meaning, and essay items all
referred to passages the examinee was asked to read. Some passages were
fictional stories; others pertained to an academic content area, such as
science or social studies. The life skills items were based on documents
that  might  be  encountered  in  everyday  life,  such  as  a  portion  of  a 
telephone  directory,  a  grocery  store  coupon,  or  an  advertisement. 
Responses  to  these  42  items  were  sent  to  Bock  and  his  associate,  Michele 
Zimowski,  who  conducted  the  analysis. 

The  analysis  was  based  on  the  raw  rather  than  the  weighted  frequency 
table of item responses. Because sampling weights have little effect on
variances  and  covariances,  they  are  unlikely  to  have  much  effect  on  factor 
analysis  results  (Bock,  personal  communication,  November  1985). 

Examination  of  the  results  led  to  the  conclusion  that  a  one-factor 
solution  could  be  retained.    The  single  factor  accounted  for  about  39 
percent  of  the  total  variance.    In  the  unrotated  two-factor  solution,  the 
first  factor  accounted  for  about  36  percent  of  the  total  variance;  the 
second  factor  accounted  for  only  4  percent.    Promax  rotation  (Hendrickson  & 
White, 1964) resulted in a correlation of .77 between the factors. (The
chi-square value for the improvement in fit achieved by adding a second
factor was 78, with 41 degrees of freedom. If a design effect correction
is incorporated [see Bock, Gibbons, & Muraki, 1985; Felleggi, 1979] based
on the mean design effect for Grade 8 [see Chapter 14.2], the second factor
narrowly misses statistical significance at α = .05.) In the single-factor
solution, reading comprehension items, particularly those that involved
fictional stories, tended to have the highest factor loadings. Life skills
items had the lowest loadings.
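
The effect of a design effect correction on the chi-square test of the second factor can be sketched as follows. Two assumptions are made for illustration. First, the correction is taken to be a simple division of the chi-square statistic by the design effect. Second, the design effect of 1.4 used here is a hypothetical value, not the Grade 8 value from Chapter 14.2. The tail probability uses the Wilson-Hilferty normal approximation rather than an exact chi-square routine.

```python
import math


def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def chi2_sf_approx(x, df):
    """Approximate upper-tail chi-square probability (Wilson-Hilferty)."""
    h = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - h)) / math.sqrt(h)
    return normal_sf(z)


chi2_change, df_change = 78.0, 41      # values reported in the text
p_unadjusted = chi2_sf_approx(chi2_change, df_change)

deff = 1.4                             # hypothetical design effect
p_adjusted = chi2_sf_approx(chi2_change / deff, df_change)
print(round(p_unadjusted, 4), round(p_adjusted, 3))
```

With these illustrative inputs the unadjusted test is clearly significant, while the adjusted test is not significant at the .05 level.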




10.1.3.5    Rosenbaum's Test of Unidimensionality, Monotonicity, and
            Conditional Independence


Rosenbaum (1984a) proves a theorem that states that if item
characteristic curves are nondecreasing functions of a single latent
variable, then conditional (local) independence of item responses, given
the latent variable, implies certain relations among the item responses.
Specifically, the conditional covariances between all monotone increasing
functions of a set of item responses, given any function of the remaining
item responses, will be non-negative. This theorem can be used to develop
statistical tests of whether an observed data set is consistent with the
assumptions of monotonicity, unidimensionality, and conditional
independence. (See Holland, 1981; Holland & Rosenbaum, in press; and
Stout, 1984, for further discussion of tests of this kind.)

As a special case of Rosenbaum's theorem, we can test the partial
association for each pair of items, given number-right score on the
remaining items, using the Mantel-Haenszel (1959) test, a conventional
procedure for analysis of discrete data. In this case, we are examining
the conditional covariance between monotone item summaries which are simply
responses to a single item. The function on which we are conditioning is
the number-right score on the remaining n - 2 items. To perform the
Mantel-Haenszel test for a particular item pair, a 2 x 2 table of item
responses is constructed for each of the K possible values of number-right
score on the remaining items. Let $n_{ijk}$ be the observed count in the ith
row, jth column, and kth table, where i = 1, 0; j = 1, 0; and k = 1, 2, ..., K.
The Mantel-Haenszel test statistic is given by

$$ Z = \frac{n_{11+} - E(n_{11+})}{\sqrt{V(n_{11+})}} \qquad [14] $$

where $E(n_{11+})$ and $V(n_{11+})$ denote the hypergeometric expectation and
variance of $n_{11+}$, given by


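$$ E(n_{11+}) = \sum_{k=1}^{K} \frac{n_{1+k}\, n_{+1k}}{n_{++k}} \qquad [15] $$

$$ V(n_{11+}) = \sum_{k=1}^{K} \frac{n_{1+k}\, n_{0+k}\, n_{+1k}\, n_{+0k}}{n_{++k}^{2}\,(n_{++k} - 1)} \qquad [16] $$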
and the plus subscript indicates summation over that subscript. The
approximate significance level is obtained by referring Z to the lower tail
of the standard normal distribution. A statistically significant result
indicates that the pair of items has a negative partial association and is
thus inconsistent with the hypothesized class of models.
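
The computation described above can be sketched in a few lines. The function below accumulates the observed count, the hypergeometric expectation (Equation 15), and the hypergeometric variance (Equation 16) over the 2 x 2 tables, and refers the standardized difference to the lower tail of the standard normal distribution. It is a minimal sketch, not the program used for the NAEP analyses. In particular, it assumes that no continuity correction is applied, and the function names and example tables are hypothetical.

```python
import math


def mantel_haenszel_z(tables):
    """Standardized Mantel-Haenszel statistic for a list of 2 x 2 tables.

    Each table is ((n11, n10), (n01, n00)): rows are responses (1, 0) to the
    first item, columns are responses (1, 0) to the second item.
    Assumption: no continuity correction is applied.
    """
    observed = expected = variance = 0.0
    for (n11, n10), (n01, n00) in tables:
        row1, row0 = n11 + n10, n01 + n00      # n_{1+k}, n_{0+k}
        col1, col0 = n11 + n01, n10 + n00      # n_{+1k}, n_{+0k}
        total = row1 + row0                    # n_{++k}
        if total < 2:
            continue                           # table carries no information
        observed += n11
        expected += row1 * col1 / total                              # Eq. 15
        variance += (row1 * row0 * col1 * col0
                     / (total ** 2 * (total - 1)))                   # Eq. 16
    if variance == 0.0:
        raise ValueError("no informative tables")
    return (observed - expected) / math.sqrt(variance)


def lower_tail_p(z):
    """P(Z <= z) for a standard normal variable."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# Hypothetical tables: no association, then a strong negative association.
z_null = mantel_haenszel_z([((10, 10), (10, 10))])
z_negative = mantel_haenszel_z([((2, 8), (8, 2))])
print(z_null, lower_tail_p(z_null))
print(z_negative, lower_tail_p(z_negative))
```

Only large negative values of Z count as evidence against the model, which is why the test is one-tailed.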




The Mantel-Haenszel approach was programmed to accommodate the
complexities of BIB spiralling in the following way: Suppose that we are
interested in assessing the conditional covariance between items $X_1$ and
$X_2$ and that, because of BIB spiralling, certain students who received
items $X_1$ and $X_2$ also received $X_3$, $X_4$, and $X_5$, whereas others
received $X_6$ and $X_7$. The test of association between $X_1$ and $X_2$ is
then based on seven 2 x 2 tables: four corresponding to the possible score
values for $X_3 + X_4 + X_5$ and three for the possible scores for
$X_6 + X_7$. Because of the spiralling method used to assign booklets to
respondents (see Section 10.1.3.1.2), the fact that respondents did not all
receive the same items or even the same number of items does not impair the
validity of the method. Items that were omitted or were administered but
not reached (see Section 10.1.3.1.2) were scored as incorrect.
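
The stratification described above can be illustrated with a small sketch. Examinees who received both items of interest are grouped by the set of remaining items in their booklet and by their number-right score on those items, and one 2 x 2 table is built per group. The item labels (including X6 and X7 for the second booklet), the data layout, and the function name are assumptions made for illustration; the actual NAEP program is not reproduced here.

```python
from collections import defaultdict
from itertools import product


def build_strata(records, item_a, item_b):
    """Return 2 x 2 tables keyed by (remaining item set, rest score).

    records -- one dict per examinee mapping item name to 0/1; omitted and
               not-reached items are assumed to be coded 0 already
    Each table is [[n11, n10], [n01, n00]] for (item_a, item_b).
    """
    strata = defaultdict(lambda: [[0, 0], [0, 0]])
    for responses in records:
        if item_a not in responses or item_b not in responses:
            continue                       # examinee did not receive the pair
        rest = sorted(k for k in responses if k not in (item_a, item_b))
        score = sum(responses[k] for k in rest)
        table = strata[(tuple(rest), score)]
        table[1 - responses[item_a]][1 - responses[item_b]] += 1
    return strata


# Hypothetical data: every possible response pattern for two booklets.
booklet_1 = ["X1", "X2", "X3", "X4", "X5"]
booklet_2 = ["X1", "X2", "X6", "X7"]
records = [dict(zip(booklet, pattern))
           for booklet in (booklet_1, booklet_2)
           for pattern in product((0, 1), repeat=len(booklet))]

strata = build_strata(records, "X1", "X2")
n_tables = len(strata)
print(n_tables)
```

With these two booklets the procedure yields seven tables, four from the first booklet and three from the second, matching the count given in the text.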

Because of the cost of computations, the Rosenbaum method was applied
to only a subset of the NAEP items: those in blocks H, K, M, N, and O.
The number of items per grade/age was 56 for Grade 4/Age 9, 53 for Grade
8/Age 13, and 56 for Grade 11/Age 17. The number of hypothesis tests,
which is equal to the number of item pairs, was 1540, 1378, and 1540 for
grade/ages 4/9, 8/13, and 11/17, respectively. To evaluate the findings of
this method, a decision must be made about the appropriate alpha level at
which to test these multiple hypotheses. On one hand, we would like to
control the overall Type I error rate at an acceptable level; on the other,
we do not want to maintain such rigorous Type I error control that a
rejection of the hypothesis of unidimensionality would be impossible. As it
turns out, even if the alpha for each hypothesis test is set at .01, a
liberal alpha level for so large a number of tests, the number of
statistically significant negative partial associations is only 4 for Grade
4/Age 9, 4 for Grade 8/Age 13, and 6 for Grade 11/Age 17. If alpha is set
at .05 for each test, the number of statistically significant results is
31, 29, and 26 for the three grade/ages, respectively (see Table 10.1(6)).
(It may at first seem surprising that less than 100α percent of the item
pairs had statistically significant negative partial associations. However,
note that we would expect to find 100α percent to be significant if all the
conditional covariances were equal to zero in the population. If they are,
in fact, greater than zero, less than 100α percent are expected to be
significantly negative.) Therefore, it is reasonable to retain the
hypothesis that the item responses can be represented by a monotonic
unidimensional latent variable model with conditional independence. It
should be noted that application of the Rosenbaum method does not provide a
test of the fit of the three-parameter logistic model or of any other
specific model.
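
The counts in this paragraph can be reproduced with simple arithmetic. The sketch below computes the number of item pairs for each grade/age and the number of rejections that 100α percent of those pairs would represent, and sets them beside the observed counts reported in the text. The dictionary layout and variable names are illustrative.

```python
from math import comb

items_per_grade = {"4/9": 56, "8/13": 53, "11/17": 56}

# Observed numbers of significant negative partial associations (from the text).
observed = {
    "4/9": {0.01: 4, 0.05: 31},
    "8/13": {0.01: 4, 0.05: 29},
    "11/17": {0.01: 6, 0.05: 26},
}

summary = {}
for grade, n_items in items_per_grade.items():
    n_pairs = comb(n_items, 2)            # one hypothesis test per item pair
    summary[grade] = {
        alpha: (n_pairs, alpha * n_pairs, observed[grade][alpha])
        for alpha in (0.01, 0.05)
    }

for grade, by_alpha in summary.items():
    for alpha, (n_pairs, expected, found) in by_alpha.items():
        print(grade, alpha, n_pairs, round(expected, 1), found)
```

In every case the observed count is below the count implied by 100α percent, which is the comparison the parenthetical remark in the text relies on.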

In  applying  the  Rosenbaum  method,  no  modifications  were  incorporated  to 
reflect  NAEP's  complex  multi-stage  cluster  sampling  scheme  (Lago,  Burke, 
Tepping,  &  Hansen,  1985).    That  is,  raw  rather  than  weighted  frequencies 
were  used  in  the  analysis  and  no  jackknifing  or  design  effect  adjustment 
was used in computing the significance probabilities of the Mantel-Haenszel
statistics.    As  noted  in  Section  10.1.3.2,  weighted  and  unweighted 
correlation  matrices  for  the  NAEP  data  are  virtually  identical,  suggesting 
that  the  weights  would  make  little  difference  in  the  Rosenbaum  analyses. 




Table 10.1(6)
Results of Rosenbaum Analyses

Within-Grade/Age Analyses

                                              Grade/Age
                                       4/9      8/13     11/17

Number of items                         56        53        56
Number of item pairs                  1540      1378      1540
Number of significant
negative partial associations:
  α = .01 per comparison                 4         4         6
  α = .05 per comparison                31        29        26


Across-Grade/Age Analyses

                                            Grade/Age Pair
                                  4/9 & 8/13   4/9 & 11/17   8/13 & 11/17

Number of comparisons                  24           24            24
Number of significant
negative partial associations:
  α = .05 per comparison                0            0             0




Furthermore, the design effect for these tests is likely to be greater than
one, as in 10.1.3.4. Adjustment of the significance tests would then lead
to a reduction in the number of item pairs found to have negative partial
associations, thus reinforcing the original conclusion about
dimensionality.


10.1.3.5.1    Across-Grade/Age Analyses

In addition to determining whether it was reasonable to regard the
reading items as unidimensional within each grade/age, it was of interest
to investigate whether unidimensionality would hold if respondents from all
three grade/ages were included. Of the entire set of items available for
dimensionality analyses (Table 10.1(1)), 25 were administered to all three
grade/ages. Twenty-four of these 25 were in the item blocks (H, K, M, N,
O) used for the Rosenbaum analyses. A method developed by Rosenbaum
(1984b), which is a variant of the approach described above, was applied to
these 24 items. The procedure provides a test of whether the item responses
of two groups of examinees are consistent with a difference in the
distribution of a unidimensional latent variable. A rejection of this
hypothesis may indicate the existence of additional dimensions. As a first
step in the analysis, an indicator variable is created to represent group
membership, with the higher value associated with the group hypothesized to
have generally higher values on the latent variable. If the pattern of
item responses is consistent with the hypothesized model, the conditional
covariances of each item with the indicator variable will be non-negative,
as described in 10.1.3.5.

For the NAEP data, a separate analysis was conducted for each pair of
grade/ages, as follows: An indicator variable representing grade/age was
created, with a value of 1 indicating the higher grade/age and the value of
0 corresponding to the lower grade/age. The partial association of each of
the 24 items with grade/age was then assessed, using the Mantel-Haenszel
(1959) test, as described in 10.1.3.5. With an alpha of .05 for each of
the 24 hypothesis tests per grade/age pair (see Table 10.1(6)), no
significant negative partial associations of items with the dummy-coded
grade/age variable were found. This means that, as we would expect
intuitively, students in higher grade/ages were more likely than students
in lower grade/ages to answer items correctly, conditional on number-right
score on the remaining items. These results are consistent with
unidimensionality of the item pool.


10.1.4    The  Impact  of  BIB  Spiralling  on  Dimensionality  Analyses 

As  noted  above,  the  missing  data  that  results  from  the  BIB  design  can 
be  regarded  as  random.  However,  this  in  itself  does  not  imply  that  the 
results  of  NAEP  data  analyses  are  unaffected  by  BIB  spiralling.    In  this 
section,  the  impact  of  BIB  spiralling  on  each  of  the  NAEP  dimensionality 
analyses  is  considered. 




The principal components analyses of the phi, tetrachoric, and image
correlation matrices make use of BIB correlation matrices. These matrices
have several properties that distinguish them from conventional correlation
matrices. For example, the standard errors of the within-block
correlations are smaller than those of the between-block correlations.
Also, BIB matrices may have negative eigenvalues, unlike conventional
Pearson correlation matrices.
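
The possibility of negative eigenvalues arises because each correlation in a BIB matrix may be computed from a different group of examinees. The contrived example below makes this concrete for three items, with each pair of items answered by a different hypothetical group. The data are deliberately extreme and are not NAEP data. The check relies on the fact that the determinant of a matrix is the product of its eigenvalues, so a negative determinant for a 3 x 3 matrix with positive trace implies a negative eigenvalue.

```python
import math


def phi(pairs):
    """Pearson (phi) correlation for a list of (x, y) pairs of 0/1 scores."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


# Each item pair is observed in a different (hypothetical) group of examinees.
r12 = phi([(1, 1), (1, 1), (0, 0), (0, 0)])   # group that took items 1 and 2
r13 = phi([(1, 1), (0, 0), (1, 1), (0, 0)])   # group that took items 1 and 3
r23 = phi([(1, 0), (0, 1), (1, 0), (0, 1)])   # group that took items 2 and 3

# Determinant of the 3 x 3 matrix with unit diagonal and these correlations.
det = 1.0 + 2.0 * r12 * r13 * r23 - r12 ** 2 - r13 ** 2 - r23 ** 2
print(r12, r13, r23, det)
```

No single complete data set could produce these three correlations together, which is why the assembled matrix is not a proper correlation matrix.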

To  investigate  the  properties  of  BIB  matrices,  a  series  of  simulation 
studies was conducted, one of which is reported here. Unidimensional item
responses  for  1,000  "subjects"  on  30  items  were  generated  using  the 
procedures  described  in  Section  10.1.3.3.    The  first  10  items  were 
arbitrarily  designated  as  block  A,  the  second  10  as  block  B,  and  the  third 
10  as  block  C.    Two  correlation  matrices  were  then  computed.    The  first  was 
an  ordinary  phi  matrix,  computed  using  the  complete  matrix  of  item 
responses.    The  second  was  computed  by  censoring  the  item  responses 
according  to  a  BIB  design  in  which  the  first  333  examinees  received  blocks 
A  and  B,  the  next  333  received  blocks  B  and  C,  and  the  remaining  334 
examinees  received  blocks  A  and  C.    Pairwise  correlations  between  all  items 
were  then  computed.    Table  10.1(7)  shows  which  subjects  were  available  to 
estimate the within- and across-block correlations in the BIB matrix. For
example,  the  correlations  between  items  within  block  A  were  estimated  using 
examinees  1-333  (who  received  blocks  A  and  B)  and  667-1,000  (who  received 
blocks  A  and  C).    The  correlations  of  items  in  block  A  with  those  in  block 
B  were  estimated  using  examinees  1-333  only  because  no  other  subjects 
received  both  of  these  blocks. 
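
The censoring pattern in this simulation can be written down directly, and doing so reproduces the numbers of subjects available for each block of the BIB correlation matrix. The sketch below covers only the bookkeeping of who received which blocks. It does not generate item responses, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

N_SUBJECTS = 1000


def blocks_received(subject):
    """Blocks given to a subject, identified by 1-based sequence number."""
    if subject <= 333:
        return ("A", "B")
    if subject <= 666:
        return ("B", "C")
    return ("A", "C")


def available_subjects(block_1, block_2):
    """Subjects whose booklet contained both blocks (pairwise-complete cases)."""
    return [s for s in range(1, N_SUBJECTS + 1)
            if block_1 in blocks_received(s) and block_2 in blocks_received(s)]


counts = {pair: len(available_subjects(*pair))
          for pair in combinations_with_replacement("ABC", 2)}
for pair, n in counts.items():
    print(pair, n)
```

Each within-block correlation rests on roughly twice as many subjects as each across-block correlation, which is the source of the uneven precision discussed in this section.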

One way to compare these two correlation matrices is in terms of their
residual matrix, computed by subtracting the BIB matrix from the complete
data matrix. Table 10.1(8) gives the lower quartile, median, and upper
quartile of the distributions of several different types of residuals. The
first two lines apply to the 30(29)/2 = 435 distinct off-diagonal elements
of the residual matrix. Descriptive statistics are given for residuals
(r) and for absolute residuals (|r|). The residuals were centered around
zero; fifty percent of them were between -.02 and +.02. The median
absolute residual was .02; fifty percent of the absolute residuals were
between .01 and .04. The next two lines of Table 10.1(8) give the
analogous information for residuals corresponding to within-block
correlations (i.e., the diagonal blocks of Table 10.1(7)); the last two
lines pertain to the residuals corresponding to across-block correlations
(i.e., the off-diagonal blocks of Table 10.1(7)). Because the within-block
correlations were based on twice as many examinees as the across-block
correlations, within-block residuals were smaller in absolute value.
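
The element counts behind these summaries follow from the block structure of the 30-item matrix. The sketch below enumerates the distinct off-diagonal item pairs, classifies each as within-block or across-block, and provides a helper that returns the three quartiles of a list of residuals. The quartile convention is the default of Python's statistics module, which is an assumption; the report does not say which convention was used.

```python
from statistics import quantiles

N_ITEMS, BLOCK_SIZE = 30, 10


def block_of(item):
    """Block index (0, 1, 2) of a zero-based item index."""
    return item // BLOCK_SIZE


# Distinct off-diagonal elements of a symmetric 30 x 30 matrix.
pairs = [(i, j) for i in range(N_ITEMS) for j in range(i)]
within = [p for p in pairs if block_of(p[0]) == block_of(p[1])]
across = [p for p in pairs if block_of(p[0]) != block_of(p[1])]


def quartile_summary(residuals):
    """Lower quartile, median, and upper quartile of residuals and |residuals|."""
    return {
        "residuals": quantiles(residuals, n=4),
        "absolute": quantiles([abs(r) for r in residuals], n=4),
    }


print(len(pairs), len(within), len(across))
```

Given a residual matrix, the helper would be applied once to all pairs and once each to the within-block and across-block subsets.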

Table 10.1(9) gives a partial comparison of the eigenstructures of the
two correlation matrices. The lefthand side of the table shows the first
ten eigenvalues of the two matrices; the righthand side gives ten elements
of  the  first  two  normalized  eigenvectors.    Clearly,  these  eigenvalues  and 
eigenvectors  were  very  similar  for  the  two  matrices.    Although  subsequent 
eigenvectors  were  more  discrepant,  application  of  conventional 




Table 10.1(7)


Subjects Available to Estimate Within- and
Across-Block Correlations for 30-Item
BIB Simulation*


             Block A                      Block B              Block C
             (Items 1-10)                 (Items 11-20)        (Items 21-30)

Block A      667 Ss (1-333, 667-1000)

Block B      333 Ss (1-333)               666 Ss (1-666)

Block C      334 Ss (667-1000)            333 Ss (334-666)     667 Ss (334-1000)


*The table gives the number of subjects (Ss) available to estimate the
correlations in each block of the matrix. The sequence numbers of the
subjects are given in parentheses.




Table 10.1(8)


Distribution of Residual Correlations
for 30-Item BIB and Complete Data Simulations*


                                          Lower                  Upper
                                         Quartile    Median     Quartile

Full residual matrix (435 elements)
  Residuals                               -.0236     -.0002       .0206
  Absolute residuals                       .0115      .0230       .0425

Within-block residual correlations
(135 elements)
  Residuals                               -.0165     -.000"       .0135
  Absolute residuals                       .0076      .0142       .0219

Across-block residual correlations
(300 elements)
  Residuals                               -.0299     -.0001       .0286
  Absolute residuals                       .0159      .0292       .0518


*Elements of the BIB matrix were subtracted from elements of the
complete data matrix. Descriptive statistics were computed for:

(1) the 30(29)/2 = 435 distinct off-diagonal elements of the
residual matrix;

(2) the 3[10(9)/2] = 135 distinct within-block residual
correlations; and

(3) the 3(10²) = 300 across-block residual correlations.




Table 10.1(9)


Partial Comparison of Eigenstructure of
BIB and Complete Data Correlation Matrices
for 30-Item Simulation


                                Ten Elements of Normalized Eigenvectors
      First Ten
     Eigenvalues              First Eigenvector        Second Eigenvector

 Complete  Incomplete       Complete  Incomplete      Complete  Incomplete

   7.55      7.48             .20       .19             -.34      -.28
   1.82      1.87             .22       .21             -.15      -.13
   1.06      1.29             .17       .15              .11       .14
   1.03      1.21             .21       .19              .07       .07
   1.01      1.10             .10       .10              .19       .21
   0.98      1.08             .16       .16             -.14      -.11
   0.90      1.03             .14       .14              .25       .27
   0.89      0.98             .19       .18             -.11      -.12
   0.84      0.90             .18       .17              .24       .22
   0.81      0.89             .21       .21             -.09      -.06



factor-analytic  methodology  to  the  BIB  matrix  would  probably  lead  to 
conclusions  that  did  not  differ  substantially  from  those  obtained  using  the 
complete data correlation matrix. It is important to note, however, that
theoretical  work  is  needed  to  fully  understand  the  statistical  properties 
of  BIB  matrices. 

An  important  property  of  the  full-information  factor  analysis  and  the 
Mantel-Haenszel  approach  is  that  they  do  not  require  the  computation  of  the 
inter-item correlation matrix. That is, estimation of the parameters of
interest  in  these  models  (factor  loadings  and  item  thresholds  in 
full-information  factor  analysis,  conditional  odds  ratios  in  the 
Mantel-Haenszel  method)  does  not  require  an  estimate  of  the  population 
correlation  matrix.    The  full-information  factor  analysis  operates  on  the 
set  of  distinct  vectors  of  item  responses;  the  Mantel-Haenszel  approach 
involves  consideration  of  the  pairwise  relations  between  items.    In  neither 
case  does  the  model  theory  dictate  that  the  item  response  matrix  be 
complete.    This  is  a  distinct  advantage  for  NAEP  applications. 
Essentially,  the  effect  of  the  BIB  missing  data  pattern  on  these  analyses 
is that some parameters are estimated with greater precision than others.
This  uneven  precision  is  unlikely  to  have  a  major  effect  on  conclusions 
about  dimensionality. 


10.1.5  Conclusions 

Overall,  the  four  dimensionality  analyses  of  the  NAEP  reading  items 
indicate  that  it  is  not  unreasonable  to  treat  the  data  as  unidimensional. 
As  a  preliminary  approach,  principal  component  analyses  of  phi  and 
tetrachoric  correlation  matrices  were  computed  for  each  of  the  three 
grade/ages  and  for  the  25  across-grade/age  items.    The  first  roots  obtained 
from  these  analyses  were  sizable,  ranging  from  17  to  25  percent  of  the 
trace  for  the  phi  matrices  and  30  to  40  percent  for  the  tetrachoric 
matrices.     (For  simulated  unidimensional  data,  the  first  root  of  the  phi 
matrix  typically  constituted  25  to  30  percent  of  the  trace.) 

As  an  experimental  method,  a  factor-analytic  approach  based  on 
Guttman's  image  theory  was  also  applied.    Principal  component  analysis  of 
the  image  correlation  matrices  yielded  larger  first  roots  than  PCA  of  the 
corresponding  phi  matrices,  but  larger  second  roots  as  well.    However,  both 
theoretical  and  empirical  examinations  of  this  method  indicate  that  the 
image  approach  does  not  avoid  the  artifacts  associated  with  the  application 
of  linear  factor-analytic  methods  to  dichotomous  data. 

Application  of  full-information  factor  analysis,  a  method  developed  by 
Bock  and  his  associates,  to  a  subset  of  the  Grade  8/Age  13  data  led  to  a 
satisfactory  fit  with  a  one-factor  model.    The  first  factor  accounted  for 
39  percent  of  the  total  variance.    Reading  comprehension  items  involving 
fictional  stories  had  the  highest  loadings  on  this  factor;  life  skills 
items  had  the  lowest. 

Finally, the Mantel-Haenszel approach developed by Rosenbaum led to a
retention of the hypothesis that the data can be represented by a
unidimensional latent variable model with conditional independence. In
addition to analyses within each grade/age, tests were conducted to
determine whether data for each pair of grade/ages were consistent with a
difference in distribution of a unidimensional latent variable. Again, the
hypothesis of unidimensionality was retained.

Although  categorization  of  the  NAEP  reading  items  is  useful  for  test 
development  and  reading  research,  the  dimensionality  analyses  reported  here 
do  not  provide  strong  empirical  evidence  for  the  existence  of  multiple 
dimensions.    Especially  when  considered  in  light  of  the  robustness  research 
discussed  in  Section  10.1.1.1,  the  results  do  not  contraindicate  the 
application of unidimensional item response theory models to the reading
data. 




Appendix  1 


Items  Used  in  Dimensionality  Analyses 


This appendix lists, for each grade/age, the items used in the NAEP
dimensionality analyses. Items are listed by NAEP ID and by booklet
location.     (Note  that  the  NAEP  ID  uniquely  identifies  an  item.  However, 
the  booklet  location  for  an  item  may  differ  across  grade/ages.)  The 
dimensionality  analyses  were  given  the  codes  A-E  in  this  appendix.  The 
following  key  explains  these  codes  and  indicates  which  section  of  the 
report  contains  an  explanation  of  the  analyses. 

A.  Component  analysis  and  image  analysis  -  within  grade/age  (see 
Sections  10.1.3.2  and  10.1.3.3) 

B.  Component analysis and image analysis - across grade/age (see
Sections  10.1.3.2  and  10.1.3.3) 

C.  Full-information  factor  analysis  (see  Section  10.1.3.4) 

D.  Rosenbaum  method  -  within  grade/age  (see  Section  10.1.3.5) 

E.  Rosenbaum  method  -  across  grade/age  (see  Section  10.1.3.5.1) 




ITEMS USED IN DIMENSIONALITY ANALYSIS - Grade 4/Age 9


                                        ANALYSES

      NAEP ID    BOOKLET LOCATION    A  B  C  D  E


1 

X  • 

MO  mini 

N  U  U  X  X  U  X 

u  u  □ 

V 

Y 

A 

z  • 

wn  n  1  t^n  1 

H— m  n 

V 
/V 

Y 

A 

X 

A 

X 

•J 

un  n  1  J^n  0 

II— m  1 

V 

V 
«v 

X 

X 

A 

H  • 

MO  n  1  t^n  '3 

NU  U 1 D  U  J 

H— m  0 

u  X  ^ 

V 

X 

A 

X 

X 

C 

D  • 

M  n  n  1 n  it 

NU  U 1 DU  4 

n~  u  X  0 

V 
A 

X 

A 

X 

A 

X 

0  • 

NU  U 1 DU  0 

H— m  ^ 
n*"  u  X  J 

V 
A 

X 

A 

X 

A 

X 

f  • 

mh  n  1   n  1 

MU  U ID  U 1 

T— m  0 

U     U  X  ^ 

V 
A 

Q 

0  • 

mh  n  1   n  0 

NU  U ID  U  Z 

U  —  U  X  0 

V 
A 

Q 

7  . 

mh  n  1  A  n  '3 

N  U  U  1  D  U  J 

U     U  X  •1 

V 
A 

1  0 

mo  n 1  A  n  A 

iM  U  U  1 D  U  4 

u  "~  V  X  J 

V 
A 

1 1  • 

N  u  u  1 0  u  z 

.T_non 

u  ~  u  ^  u 

V 
A 

1  *7 
IZ  • 

mh  n  0  n  n 1 
N  u  u  zu  u  1 

ir^n  no 
rv—  u  u  7 

V 
A 

X 

A 

X 

X 

lo  • 

N  u  u  z  u  u  z 

K— m  n 

r\~  u  X  u 

V 
A 

X 

A 

X 

X 

14  • 

NU  u  Z  U  U  J 

K—  mi 

fx**  U  X  X 

V 
A 

X 

A 

X 

X 

mh  n  0 1 n 1 
NU  u  z 1 u 1 

if_ni  p 
rv"  u  X  0 

Y 

A 

X 

A 

X 

A 

X 

10  • 

N  u  u  z  1  u  z 

v — n  1  Q 
rv""  u  X  J 

Y 

A 

X 

A 

X 

X 

LI. 

Kin  n  0  /!  n  1 
NU  u  z  4  u 1 

T  — no  0 

Li"  U  ^  ^ 

Y 

A 

10  • 

Kin  n  0  "7  n  0 
NU  u  z  /  u  z 

T  — non 

Li"  U  ^  U 

Y 

A 

1  Q 
I7  • 

Mn  n  0  p  n 1 

N  u  u  z  0  U  X 

T  -nod 

Li     V  ^  4 

Y 

A 

Mn  n  OP  n  o 

N  U  U  Z  0  U  Z 

V  ^  «^ 

Y 

A 

Zl  • 

Mn  n  0  p  n  7 

N  U  U  Z  0  U  0 

T  -006 

1j  "  V  ^  V 

Y 

A 

zz  • 

Mn  n  "3  n  n  1 
N  u  u  0  u  u  1 

M-m  n 

X 

A 

X 

X 

zo  • 

Mnn  "^nno 

N  U  U  0  U  U  ^ 

M— n  1 1 

X 

X 

X 

X 

zh  • 

Mn  n  '3n  n  "3 

N  u  u  0  U  U  D 

M— m  0 

X 

X 

A 

X 

X 

ZD  • 

Mn  n  "3 1  n  1 
N  u  u  0 1  u  1 

M— m  A, 

V  X  "A 

X 

A 

X 

A 

X 

X 

ZO  • 

Mn  n  "3 1  n  0 

N  U  U  0  X  U  Z 

M-m  R 

X 

X 

A 

X 

X 

Z  /  • 

Mn  n  "3 1  n  "3 
N  u  u  0 1  u  0 

M-m  fi 

X 

X 

X 

X 

ZO  • 

Mn  n  "37  n  1 
N  u  u  0  /  u  1 

M— no  "3 

N  **  U  ^  0 

Y 

A 

X 

A 

X 

X 

Z7  • 

Mn  n  "3  7  n  0 
N  U  U  J  /  u  z 

M-nod 

X 

A 

X 

X 

X 

Mn  n  "37  n  "3 
N  u  u  0  /  u  0 

M-noR 

X 

A 

X 

X 

X 

0 1  • 

Mn  n  "3  p  n  1 
N  u  u  0  0  u  1 

o-m  0 

X 

A 

X 

A 

X 

X 

0  Z  • 

Mn  n  "3  p  n  0 
N  u  u  0  0  u  z 

o-m  "3 

X 

A 

X 

A 

X 

X 

Mn  n  '3  p  n  '3 

N  U  U  J  0  U  0 

O-m  d 

X 

A 

X 

X 

X 

^  4  • 

Mn  n  Aim 

N  U  U  1  X  U  X 

O-017 

X 

X 

35. 

N004201 

O-018 

X 

X 

X 

X 

36. 

N004202 

O-019 

X 

X 

X 

X 

37. 

N004401 

P-007 

X 

38. 

N004402 

P-008 

X 

39. 

N004403 

P-009 

X 

40. 

N004701 

Q-010

X 

41. 

N004702 

Q-011 

X 

42. 

N004703 

Q-012 

X 

43. 

N004801 

Q-013 

X 

44. 

N004901 

Q-014 

X 

X 



45.

N005101

Q-015 

X 

46.

N008601

H-006 

X 

X 

47.

N008602

H-007 

X 

X 

48.

N008603

H-008 

X 

X 

49.

N008701

H-009 

X 

X 

50.

N008801 

J-018 

X 

51.

N008901 

J-021 

X 

52.

N008902 

J-022 

X 

53  • 

N008904 

J-024 

X 

54.

N009001 

K-012 

X 

X 

55  • 

N009002 

K-013 

X 

X 

56 . 

N009003 

K-014 

X 

X 

57.

N009004 

K-015 

X 

X 

58.

N009101 

K-016 

X 

X 

59.

N009201 

K-017 

X 

X 

60.

N009401 

L-023 

X 

61.

N009601

L-021 

X 

62.

N009701

M-005 

X 

X 

63.

N009702

M-006 

X 

X 

64.

N009703 

M-007 

X 

X 

65.

N009704 

M-008 

X 

X 

66.

N009705

M-009 

X 

X 

67.

N009801 

N-012 

X 

X 

68.

N009901 

N-013 

X 

X 

69. 


N010002 

N-018 

X 

X 

70.

N010003 

N-019 

X 

X 

71.

N010102 

N-021 

X 

X 

72.

N010103 

N-022 

X 

X 

73. 

N010201 

O-016 

X 

X 

74.

N010301

O-015 

X 

X 

75.

N010401 

O-020 

X 

X 

76.

N010402 

O-021 

X 

X 

77.

N010403 

O-022 

X 

X 

78.

N010501

P-010 

X 

79.

N010502

P-011

X

80. 

N010503 

P-012 

X 

81. 

N010504 

P-013 

X 

82. 

N010601 

P-014 

X 

83. 

N010602 

P-015 

X 

84. 

N010603 

P-016 

X 

85. 

N010604 

P-017 

X 

86. 

N010605 

P-018 

X 

87. 

N010701 

P-019 

X 

88. 

N010801 

Q-016 

X 

89. 

N010902 

Q-018 

X 


90.

N010903

Q-019

X

91.

N010904

Q-020

X

92.

N011001

R-005

X

93.

N011002

R-006

X

94.

N011003

R-007

X

95.

N011004

R-008

X

96.

N011101

R-009

X

97.

N011201

R-010

X

98.

N011301

R-011

X

99.

N011302

R-012

X

100.

N011401

R-013

X

101 . 

N011402 

R-014 

X 

102. 

N011403 

R-015 

X 

103. 

N011404 

R-016 

X 

104. 

N014001 

M-013 

X 

105. 

N014101 

Q-021 

X 

106. 

N014301 

N-014 

X 

107. 

N014302 

N-015 

X 

108. 

N014303 

N-016 

X 



ITEMS  USED  IN  DIMENSIONALITY  ANALYSIS  -   Grade  8/Age  13 


ANALYSES 


NAEP  ID  BOOKLET  LOCATION     A  B  C  D  E

1 

mO  01101 
U  X  X  U  X 

H-006 

X 

2 

NO  019  01 
nu  u  X  «  u  X 

H-007 

X 

X 

X 
X 

3 

mO  0 19  0  9 

M  U  U  X  ^  U  ^ 

H-008 

X 

X 

X 

4 

MOO  1^01 
nu  u  X  J  u  X 

H-009 

X 

X 

X 

5 

mOOI  ^^^9 
nu  u  X  ^  V  tL 

H-010

X 

X 

X 

6 

NOn  1  "J 

•V  w  V  X  ^  o  ^ 

H-011 

X 

X 

X 

7 

MO  014  01 
nu  u  X  4  u  X 

H-012 

X 

X 

o 

o  • 

NU  U  X  ^Ui 

H-013 

X 

X 

X 

X 

Q 

^  • 

mO  01^09 
nu  u  X  3  u  z 

H-014 

X 

X 

X 

X 

X 

1  n 

NUU 1 bO  3 

H-015 

X 

X 

X 

X 

X 

mh  n  1  c  n  ii 

H-016 

X 

X 

X 

X 

X 

1  7 

Kin  0 1  c 

NU  U  X  D  JD 

H-018 

X 

X 

X 

X 

X 

mo  n  1 1«  0 1 

NU  U  X  D  U  X 

J-011 

X 

X 

X 

1  A 
X  s  • 

Kin  0 1   n  0 

NU U  XD U^ 

J-012 

X 

X 

15 

X  ^  • 

MH  0  1  It  0  0 
NU  U  X  0  U  3 

J-013 

X 

X 

16 

MO  0  1    0  4 
n  u  u  X  Q  u  *i 

J-014 

X 

X 

17 

X  /  • 

MOO  17  01 

U  U  X  /  U  X 

J-017 

X 

X 

18 

X  w  • 

MO  01709 
CVU  U  X  /  u  ^ 

J-018 

X 

X 

Mfi  n  1  7  n  "J 

NU  U  X  /  U  3 

J-019 

X 

X 

20 

MO  0  10  0  9 
NU  U  X  0  U  Z 

J-021 

X 

X 

21 

^  X  • 

MO  0 1  Q  O 1 
N  U  U  X  7  U  X 

J-022 

X 

X 

• 

KtO  O  1  O  O  0 
NUU  X7U  i 

J-024 

X 

X 

KlO  0  9  0  0  1 
NU  U  Z  U  U  X 

K-009 

X 

X 

X 

24 

MO  n  9  0  0  0 
NUU  zu  uz 

K-010

X 

X 

X 

X 

25 

MO  0  9  0  0  ^ 
NU  U  ZU  U  5 

K-011 

X 

X 

X 

X 

26 

AO* 

MO  0  9  101 
NUU  Z  X  U  X 

K-012 

X 

X 

X 

X 

27 

MOO  910  9 
N  U  U  ^  X  U  ^ 

K-013 

X 

X 

X 

X 

28 

M00990 1 

n  u  J  ^  ^  u  X 

K-014 

X 

X 

X 

29 

MOO  9909 
n  u  u  ^  ^  u  ^ 

K-015 

X 

X 

30 

MO  0  9  9  0  "3 

K-016 

X 

X 

31 

MO 0  900  9 

M-006 

X 

X 

32 

MO  0  9  QO 

K^U  U  ^  7U  3 

M-007 

X 

X 

33 

MO  0  9  0  0^ 
NU  U  Z  7  U  4 

M-008 

X 

X 

34 

MO  0  9  00  C 
NU  U  Z  7  U  D 

M-009 

X 

X 

35 

MO  0  90n£ 
NU  U  uD 

M-010

X 

X 

36 

MO  0  9  001 

N  U  U  J/  U  U  X 

M-011

X 

X 

X 

37 

MO  0^009 

M-012 

X 

X 

X 

X 

38. 

N003003 

M-013 

X

X

X 

X 

39. 

N003101 

M-014 

X 

X 

X 

X 
X 

40. 

N003102 

M-015 

X 

X 

X 

41. 

N003103 

M-016 

X 

X 

X 

X 

42. 

N003201

N-012 

X 

X 

X 

X 

43. 

N003202 

N-013 

X 

X 

X 

44. 

N003203 

N-014 

X 

X 

X 

45. 

N003204 

N-015 

X 

X 

X 


46. 

N003301 

N-016 

X 

X 

X 

47. 

N003401 

N-017 

X 

X 

X 

48. 

N003501 

N-018 

X 

X 

X 

49. 

N003601 

N-019 

X 

X 

X 

50. 

N003602 

N-020 

X 

X 

X 

51. 

N003701 

N-021 

X 

X 

X 

X 

X 

52. 

N003702 

N-022 

X 

X 

X 

X 

X 

53. 

N003703 

N-023 

X 

X 

X 

X 

X 

54. 

N003801 

O-012 

X 

X 

X 

X 

X 

55. 

N003802 

O-013 

X 

X 

X 

X 

X 

56. 

N003803 

O-014 

X 

X 

X 

X 

X 

57. 

N003901 

O-016 

X 

X 

X 

58. 

N004002 

O-015 

X 

X 

X 

59. 

N004101 

O-017 

X 

X 

X 

60. 

N004201 

O-018 

X 

X 

X 

X 

X 

61. 

N004202 

O-019 

X 

X 

X 

X 

X 

62. 

N004301 

O-020 

X 

X 

X 

63. 

N004302 

O-021 

X 

X 

X 

64. 

N004401 

P-007 

X 

65. 

N004402 

P-008 

X 

66. 

N004403 

P-009 

X 

67. 

N004501 

P-010 

X 

68. 

N004502 

P-011 

X 

69. 

N004601 

P-012 

X 

70. 

N004602 

P-013 

X 

71 

N004603 

P-014 

X 

72. 

N004604 

P-015 

X 

73. 

N004701 

Q-007 

X 

74. 

N004702 

Q-008 

X 

75. 

N004703 

Q-009 

X 

76. 

N004801 

Q-010

X 

77. 

N004901 

Q-011 

X 

X 

78. 

N005001 

Q-013 

X 

79. 

N005002 

Q-014 

X 

80. 

N005003 

Q-015 

X 

81. 

N005101 

Q-012 

X 

82. 

N005201 

Q-016 

X 

83. 

N005202 

Q-017 

X 

84. 

N005203 

Q-018

X 

85. 

N005301 

Q-019 

X 

86. 

N005302 

Q-020 

X 

87. 

N005303 

Q-021 

X 

88. 

N005304 

Q-022 

X 

89. 

N005305

Q-023 

X 

90. 

N005403 

R-007 

X 

91. 

N005404 

R-008

X 


92.

N005405

X

93.

N005406

R-010

X

94.

N005407

R-011

X

95. 

N005503 

R-014 

X 

96. 

N005504 

R-015 

X 

97. 

N005505

R-016 

X 

98. 

N005601

R-017 

X 

99. 

N005602 

R-018 

X 

100. 

N005603 

R-019 

X 



ITEMS  USED  IN  DIMENSIONALITY  ANALYSIS  -  Grade  11/Age  17


ANALYSES

NAEP  ID  BOOKLET  LOCATION     A  B  C  D  E

1 . 

N001301 

H-010

X 

X 

2  • 

N001302 

H-011 

X 

X

3. 

N001303 

H-012 

X 

X 

4. 

N001401 

H-013 

X 

X 

5. 

N001501 

H-014 

X 

X  X 

X 

6. 


N001502 

H-015 

X 

X  X 

X 

7. 

N001503 

H-016 

X 

X  X 

X 

8. 


N001504 

H-017 

X 

X  X 

X 

9. 


N001506 

H-019 

X 

X  X 

X 

10. 


N001701 

J-012 

X 

11  • 

N001702 

J-013 

X 

12  • 


N001703 

J-014 

X 

13. 


N001901 

J-015 

X 

14. 

N001903 

J-017 

X 

15. 

N002001 

K-009 

X 

X  X 

X 

16. 

N002002 

K-010

X 

X  X 

X 

17. 

N002003 

K-011 

X 

X  X 

V 

X 

18. 


N002101 

K-012 

X  X 

V 

X 

19. 

N002102 

K-013 

X 

X  X 

X 

20  • 

N002201 

K-014 

X 

X 

21  • 


N002202 

K-015 

X 

X 

22. 


N002203 

K-016 

X 

X 

23  • 


N002501 

L-027 

X 

24  • 


N002701 

L-028 

X 

25. 


N002702 

L-029 

X 

26. 


N002801 

L-030 

X 

27  • 


N002802 

L-031 

X 

28. 


N002803 

L-032 

X 

29. 

N002902

M-006 

X 

X 

30  • 


N002903 

M-007 

X 

X 

31  • 


N002904 

M-008

X 

V 

X 

32. 

N002905 

M-009 

X 

V 

X 

33. 

N002906 

M-010

X 

X 

34. 

N003001 

M-011 

X 

X  X 

X 

35. 


N003002 

M-012 

X 

X  X

V 

X 

36. 


N003003

M-013 

X 

X  X 

V 

X 

37. 


N003101 

M-014 

X 

X  X 

X

38.

N003102 

M-015 

X 

X  X 

V 

X 

39. 

N003103 

M-016 

X 

X  X 

X 

40. 

N003201 

N-021 

X 

X 

41 . 


N003202 

N-022 

X 

X 

42. 


N003203 

N-023 

X 

X 

43. 


N003204 

N-024 

X 

V 

X 

44. 


N003301 

N-025 

X 

V 

X 

45. 


N003501 

N-027 

X 

X 

46. 


N003601 

N-028 

X 

X

47. 

N003602 

N-029 

X 

X 



48. 

N003701 

N-030 

X 

X 

X 

X 

49. 

N003702 

N-031 

X 

X 

X 

X 

50. 

N003703 

N-032 

X 

X 

X 

X 

51. 

N003801 

O-012 

X 

X 

X 

X 

52. 

N003802

O-013 

X 

X 

X 

X 

53. 

N003803 

O-014 

X 

X 

X 

X 

54. 

N004201 

O-021 

X 

X 

X 

X 

55. 

N004202 

O-022 

X 

X 

X 

X 

56. 

N004301 

O-023 

X 

X 

57. 

N004302 

O-024

X 

X 

58. 

N004501 

P-020 

X 

59. 

N004502 

P-021 

X 

60. 

N004601 

P-022 

X 

61. 

N004602 

P-023 

X 

62. 

N004603 

P-024 

X 

63. 

N004604 

P-025 

X 

64. 

N004901 

Q-010

X 

X 

65. 

N005001 

Q-007 

X 

66. 

N005002 

Q-008 

X 

67. 

N005003 

Q-009 

X 

68. 

N005201 

Q-011 

X 

69. 

N005202 

Q-012 

X 

70. 

N005203 

Q-013 

X 

71. 

N005503 

R-014 

X 

72. 

N005504 

R-015 

X 

73. 

N005505 

R-016 

X 

74. 

N015101 

R-017 

X 

75. 

N015102 

R-018 

X 

76. 

N015103 

R-019 

X 

77. 

N015104 

R-020 

X 

78. 

N015201 

N-026 

X 

X 

79. 

N015502 

P-016 

X 

80. 

N015503 

P-017 

X 

81. 

N015504 

P-018 

X 

82. 

N015505 

P-019 

X 

83. 

N015901 

Q-014 

X 

84. 

N015902 

Q-015 

X 

85. 

N015903 

Q-016 

X 

86. 

N015904 

Q-017 

X 

87. 

N016001 

O-015 

X 

X 

88. 

N016002 

O-016 

X 

X 

89. 

N016003 

O-017 

X 

X 

90. 

N016004 

O-018 

X 

X 

91. 

N016005 

O-019 

X 

X 

92. 

N016006 

O-020 

X 

X 

93. 

N017001 

H-007 

X 

X 

94. 

N017002 

H-008 

X 

X 

95. 

N017003 

H-009 

X 

X 



Appendix  2 


A  Procedure  for  Obtaining  a  Gramian  Matrix  that  Approximates  a 
BIB  Correlation  Matrix  for  NAEP  Items 


(1) Start with the weighted (i.e., incorporating sampling weights) BIB
covariance matrix.

(2) Substitute zeroes for the negative eigenvalues. (The negative
eigenvalues constituted 4, 2, and 2 percent of the trace of the
missing-data covariance matrix for grade/ages 4/9, 8/13, and 11/17,
respectively. There were no negative eigenvalues for the
across-grade/age matrix.)

(3) Now obtain the "reconstructed" covariance matrix, $C^{*}$, using the
following equation:

$$C^{*} = Q D^{*} Q',$$

where $Q$ is the matrix of normalized eigenvectors of the original
covariance matrix and $D^{*}$ is a diagonal matrix of eigenvalues, with
zeroes substituted for the negative eigenvalues. The matrix
$C^{*-} = Q D^{*-} Q'$ is the pseudo-inverse of $C^{*}$, where the elements
of $D^{*-}$ are the reciprocals of the corresponding elements of $D^{*}$
for positive elements of $D^{*}$ and zeroes for zero elements of $D^{*}$.

(4) It is now possible to obtain a reconstructed correlation matrix,
$R^{*}$, corresponding to $C^{*}$, using ordinary methods. The
pseudo-inverse of $R^{*}$ can be obtained as follows:

$$R^{*-} = S C^{*-} S,$$

where $S$ is a diagonal matrix of the square roots of the diagonal elements
of $C^{*}$.

It  is  desirable  to  begin  with  the  covariance  matrix  in  Step  1  because 
operating  on  the  correlation  matrix,  R,  directly  will  produce  a 
reconstructed  R  that  does  not  have  ones  on  the  diagonal. 

The  medians  of  the  residuals  obtained  by  subtracting  elements  of  R* 
from  elements  of  the  original  R  were  .007,  .002,  and  .003  for  grade/ages 
4/9,  8/13,  and  11/17,  respectively.    In  addition,  the  eigenstructures  for 
R*  matrices  were  very  similar  to  those  for  the  original  R's.    The  method  is 
inexpensive  and  is  not  difficult  to  program.    An  alternative  method  of  B. 
Wingersky  (1984)  produced  smaller  residuals,  but  was  prohibitively 
expensive  to  execute. 
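Steps (1) through (4) can be sketched in code as follows. This is an illustrative sketch only, not the program used for the report: the function names `jacobi_eigh` and `gramian_approximation` and the 3 x 3 example matrix are hypothetical, and a small pure-Python Jacobi rotation routine stands in for whatever eigenvalue routine was actually used. The pseudo-inverse step is omitted.

```python
import math

def jacobi_eigh(a, sweeps=100, tol=1e-20):
    """Eigenvalues and normalized eigenvectors (columns of v) of a symmetric matrix."""
    n = len(a)
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]; kept within +/- pi/4.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J' A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def gramian_approximation(cov):
    """Zero the negative eigenvalues, rebuild C* = Q D* Q', rescale C* to R*."""
    n = len(cov)
    eigvals, q = jacobi_eigh(cov)
    d_star = [max(lam, 0.0) for lam in eigvals]          # step (2)
    c_star = [[sum(q[i][k] * d_star[k] * q[j][k] for k in range(n))
               for j in range(n)] for i in range(n)]     # step (3)
    s = [math.sqrt(c_star[i][i]) for i in range(n)]
    r_star = [[c_star[i][j] / (s[i] * s[j]) for j in range(n)]
              for i in range(n)]                         # step (4)
    return c_star, r_star

# Hypothetical covariance matrix with one negative eigenvalue.
cov = [[1.0, 0.9, -0.9],
       [0.9, 1.0, 0.9],
       [-0.9, 0.9, 1.0]]
c_star, r_star = gramian_approximation(cov)
```

Because the rescaling in step (4) is applied to the reconstructed covariance matrix, `r_star` has ones on its diagonal, which is the reason given in the text for starting from covariances rather than correlations.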




Chapter  10.2 
JOINT  ESTIMATION  PROCEDURES 


Marilyn  Wingersky 
Bruce  A.  Kaplan
Albert  E.  Beaton 

Educational  Testing  Service 


In its proposal for the NAEP grant (1982), ETS outlined how it would
use the joint maximum likelihood procedures incorporated in the LOGIST
program (see Wingersky, Barton, & Lord, 1982; M. S. Wingersky, 1983; M. S.
Wingersky, 1984) to estimate reading item parameters and individual
proficiencies. This method requires that a substantial number (25 or more)
of exercises be administered to each student whose proficiency is to be
estimated. Within the time available between receiving the grant and
beginning field operation, the reading exercises, which were prepared by
the previous grantee, could not be fitted into the block structure of the
new design in such a way as to reach the number of exercises needed. The
lack of sufficient exercises per student resulted in an undue number of
students who had perfect scores or who scored below chance level and thus
could not be assigned finite maximum likelihood estimates of their
proficiencies. Because losing these students would bias population
estimates made from the remaining data, Winsorized estimates of the
population parameters were computed. However, we then discovered a new
technology that would provide better estimates, and thus this new method of
parameter estimation was used.

The  purpose  of  this  section  is  to  show  the  steps  that  we  took  in 
fulfilling  the  ETS  commitment  to  use  joint  maximum  likelihood  procedures, 
the  winsorization  process,  and  the  resultant  effect  on  the  distributions  of 
reading  proficiency.  Chapter  10.3  will  describe  the  new  method  of 
estimation  that  was  actually  used  in  producing  the  results  that  were 
presented  in  NAEP  reports. 


10.2.1  Method 

The  joint  maximum  likelihood  estimation  procedures  incorporated  in  the 
LOGIST  program  are  most  appropriately  applied  to  data  sets  in  which  each 
student  responds  to  25  or  more  exercises.    Because  the  data  collected  in 
the  Year  15  reading  assessment  did  not  meet  the  recommended  minimum  of  25 
exercises  per  student,  an  alternative  two-step  estimation  procedure  was 
devised. In the first step of the alternative procedure, the LOGIST
computer program was used to fit the three-parameter logistic IRT model to
a sample of the available data. The sample was selected to maximize the
precision of the estimated item parameters while minimizing convergence
problems. In the second step of the procedure, the MLE-ABIL program was
used to obtain maximum likelihood ability estimates (MLEs) for all students
who were presented at least seventeen items. These steps are described in
the following paragraphs. Additional details can be found in M. S.
Wingersky (1986).

Both  the  LOGIST  program  and  the  MLE-ABIL  program  require  an  input  data 
matrix  consisting  of  observed  item  responses  which  have  been  coded  as 
right,  wrong,  omitted  or  "not  reached."    The  difference  between  an  omitted 
response  and  a  "not  reached"  response  is  described  in  Section  10.1.3.1.2. 
In brief, unanswered items which occur prior to the last valid response
within a block are coded as omits. Unanswered items which occur subsequent
to the last valid response in a block are coded as "not reached." In the
Year 15 assessment, items marked "I don't know" were also coded as omits.
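The coding rule can be sketched as follows. This is a hypothetical illustration, not the LOGIST or MLE-ABIL input routine: the function name, the use of `None` for an unanswered item, and the string `"IDK"` for an "I don't know" mark are assumptions. The sketch also assumes that an "I don't know" mark counts as a response when locating the last response in a block, which the text does not state.

```python
def code_block_responses(raw, key):
    """Code one block of responses as right / wrong / omit / not_reached.

    raw: responses in block order; None = unanswered, "IDK" = "I don't know"
         (both conventions are assumptions of this sketch).
    key: the correct alternative for each item.
    """
    answered = [i for i, r in enumerate(raw) if r is not None]
    last = answered[-1] if answered else -1  # position of the last response
    coded = []
    for i, (r, k) in enumerate(zip(raw, key)):
        if r is None:
            # Unanswered before the last response: omit; after it: not reached.
            coded.append("omit" if i < last else "not_reached")
        elif r == "IDK":
            coded.append("omit")
        else:
            coded.append("right" if r == k else "wrong")
    return coded

coded = code_block_responses(["A", None, "IDK", "C", None, None],
                             ["A", "B", "C", "D", "A", "B"])
```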

Both  the  LOGIST  program  and  the  MLE-ABIL  program  treat  "not  reached" 
items  as  if  they  had  never  been  administered.    The  rationale  for  this 
treatment  rests  on  a  fundamental  property  of  IRT  models  which  states  that 
an  examinee's  ability  is  invariant  with  respect  to  the  items  which  are  used 
to  measure  that  ability.    In  the  context  of  the  NAEP  reading  assessment, 
this  means  that  except  for  sampling  fluctuations,  an  individual  examinee's 
estimated  reading  proficiency  value  will  be  the  same  regardless  of  the 
particular  subset  of  items  to  which  the  examinee  has  chosen  to  respond.  Of 
course,  reasonable  numbers  of  responses  are  required  to  obtain  precise 
parameter  estimates.    In  the  Year  15  calibration,  a  cutoff  value  of 
seventeen  items  per  examinee  was  established.    In  the  first  step  of  the 
calibration,  this  cutoff  was  applied  to  the  number  of  items  reached. 
Omitted  responses  were  included  in  the  count  of  items  reached.    In  the 
second  step  of  the  calibration,  the  same  cutoff  value  of  seventeen  items 
was applied to the number of items presented. (The MLE-ABIL program can
accept a slightly less stringent data input requirement because it is only
estimating abilities; item parameters are fixed rather than estimated.)

Many applications of IRT allow for the fact that some examinees will
respond correctly to an item by guessing. It is typically assumed that if
an  examinee  elects  to  guess,  the  probability  that  he  or  she  will  guess 
correctly  can  be  approximated  by  the  reciprocal  of  the  number  of  valid 
response  alternatives.    Both  LOGIST  and  MLE-ABIL  incorporate  this 
assumption  by  maximizing  a  likelihood  function  which  has  been  modified  to 
allow  partial  credit  for  omitted  responses.    In  effect,  omitted  responses 
are  treated  as  fractionally  correct,  at  a  proportion  equal  to  the 
reciprocal  of  the  number  of  valid  response  alternatives.    This  modification 
is  described  in  detail  by  Lord  (1974). 
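A minimal sketch of such a modified likelihood is given below. It is not the LOGIST or MLE-ABIL code: the function names, the tuple layout for items, and the scaling constant D = 1.7 are assumptions made for illustration. An omitted response enters the log-likelihood with fractional credit equal to the reciprocal of the number of alternatives, and a "not reached" item is skipped as if it had never been administered.

```python
import math

D = 1.7  # logistic scaling constant (an assumption of this sketch)

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def modified_log_likelihood(theta, items, responses):
    """items: (a, b, c, n_alternatives) tuples; responses: coded strings."""
    total = 0.0
    for (a, b, c, n_alt), resp in zip(items, responses):
        if resp == "not_reached":
            continue  # treated as never administered
        if resp == "omit":
            u = 1.0 / n_alt  # fractional credit for an omitted response
        else:
            u = 1.0 if resp == "right" else 0.0
        p = p3pl(theta, a, b, c)
        total += u * math.log(p) + (1.0 - u) * math.log(1.0 - p)
    return total
```

For a four-alternative item, an omit therefore contributes 0.25 log P + 0.75 log(1 - P) to the function being maximized.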


10.2.2    Item  Parameter  Calibration 

Initially, the calibrations were done separately by grade/age to see how
similar the parameter estimates were for the items that were common across
the grade/ages. If the estimates were similar enough, all of the
grade/ages could be calibrated together, giving better parameter estimates
for the common items and a better linking between the ages than if the ages
were calibrated separately and linked with some standard linking procedure.

The  first  grade/age  to  be  analyzed  was  Grade  8/Age  13,  the  middle 
grade/age  in  proficiency.    Included  in  the  calibration  run  were  examinees 
who took two or more of blocks H, J, K, M, N, O, P, Q, R, and U and reached
at  least  seventeen  items  regardless  of  how  many  items  they  omitted. 
Excluded  from  the  calibration  run  were  examinees  who  took  blocks  L,  V,  W, 
and  Y,  which  had  fewer  than  seven  reading  items.    Although  it  would  have 
been  possible  to  calibrate  these  items  when  given  with  other  blocks,  final 
proficiencies  estimated  for  examinees  who  took  only  these  blocks  would  be 
poorly  estimated  because  of  the  small  number  of  items.    Block  X  was  also 
not  included,  even  though  it  had  eight  reading  items,  because  six  of  the 
items were puns, the only puns in the entire item collection. Examinees
who  had  zero  or  perfect  scores  were  excluded.    Of  the  10,255  examinees,  490 
were  removed  because  they  reached  fewer  than  seventeen  items.    There  were 
113  items  and  9,765  examinees  in  the  calibration  run. 

The  same  criteria  used  to  determine  which  Grade  8/Age  13  examinees  to 
include  in  the  calibration  were  used  for  Grade  4/Age  9.    Blocks  included 
were H, J, K, L, M, N, O, P, Q, R, U, and V. Of the 13,927 examinees,
1,786 who reached fewer than seventeen items were removed. There were 127
items  and  12,141  examinees  in  the  calibration  run.    The  Grade  4/Age  9  item 
parameter  estimates  were  then  transformed  so  that  they  would  be  on  the  same 
proficiency  scale  as  the  Grade  8/Age  13  item  parameter  estimates.  The 
transformation  program  used,  TBLT,  computes  the  linear  transformation  that 
minimizes  the  squared  difference  between  the  two  test  characteristic  curves 
computed  for  the  common  items  (Stocking  &  Lord,  1983). 
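The criterion that the text attributes to TBLT can be sketched as below. This is an illustration of the idea, not the TBLT program: the function names, the scaling constant D = 1.7, the grid of proficiency points, and the item parameters are all assumptions. A linear change of scale theta* = A theta + B sends (a, b, c) to (a / A, A b + B, c), and the criterion sums the squared differences between the two test characteristic curves of the common items.

```python
import math

D = 1.7  # logistic scaling constant (an assumption of this sketch)

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected number right at theta."""
    return sum(p3pl(theta, a, b, c) for a, b, c in items)

def rescale(items, A, B):
    """Express (a, b, c) on the scale theta* = A * theta + B."""
    return [(a / A, A * b + B, c) for a, b, c in items]

def tcc_criterion(A, B, items_from, items_to, thetas):
    """Sum of squared TCC differences for the common items."""
    moved = rescale(items_from, A, B)
    return sum((tcc(t, items_to) - tcc(t, moved)) ** 2 for t in thetas)

# Hypothetical common items on the source scale and on the target scale.
items_from = [(1.0, -1.0, 0.20), (0.8, 0.0, 0.25), (1.2, 1.0, 0.20)]
items_to = rescale(items_from, 1.2, -0.5)
thetas = [-3.0 + 0.5 * k for k in range(13)]
loss_at_truth = tcc_criterion(1.2, -0.5, items_from, items_to, thetas)
loss_at_identity = tcc_criterion(1.0, 0.0, items_from, items_to, thetas)
```

Any numerical minimizer over (A, B) could be applied to `tcc_criterion`; the choice of minimizer is left open here.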

For the Grade 11/Age 17 calibration, it was necessary to include blocks
with as few as five items. Otherwise, there would be too few examinees per
item to calibrate any items. Thus blocks L and Y with only six items and
block J with only five items were included. The blocks used were H, J, K,
L, M, N, O, P, Q, R, U, and Y. Examinees who reached fewer than seventeen
items were excluded. Booklets 2, 19, 27, 30, and 57 were excluded because
they contained fewer than seventeen reading items, even though they
contained two of the above reading blocks. Of the 12,011 examinees, 1,314
were removed because they reached fewer than seventeen items. There were
113 items and 10,697 examinees in the calibration run. The item parameters
were then transformed using TBLT to the proficiency scale of Grade 8/Age 13.

The  parameter  estimates  for  the  items  that  were  common  across  the 
different  grade/ages  were  consistent  enough  to  warrant  calibrating  all  of 
the  grade/ages  together.    The  same  examinees  and  the  same  items  for  each 
grade/age  were  used  as  were  used  in  the  single  calibrations.  Although 
blocks  L  and  Y  were  calibrated  for  Grade  11/Age  17  in  the  single  run,  they 
were  not  calibrated  for  Grade  8/Age  13.    Consequently,  the  responses  of 
Grade  8/Age  13  examinees  to  the  items  in  blocks  L  and  Y  were  coded  as  "not 
reached" for this run. Although the dataset of item responses contained
252 items, only 229 items were calibrated. Items in blocks W and X were
not used and were coded "not reached." Items in block V for Grade 8/Age 13
and Grade 11/Age 17 were not used and were coded "not reached." Of the
initial 36,193 examinees, 3,590 were removed because they reached fewer
than seventeen items. There were 229 items and 32,603 examinees in the
calibration run. Again, only examinees who reached at least seventeen
items were included. Examinees were included regardless of the number of
items coded as omits.

Item proficiency regression plots, in which the observed proportions
correct were plotted separately for the three groups, were examined to see
if  the  different  grade/age  groups  were  responding  differently  on  the  common 
items.    Although  there  were  several  items  for  which  individual  grade/age 
groups  responded  differently,  it  was  decided  to  use  the  results  with  all 
grade/ages  calibrated  together. 


10.2.3    Maximum  Likelihood  Estimates  of  Proficiency 

A  student's  maximum  likelihood  estimate  (MLE)  of  proficiency  on  a  given 
scale  indicates  the  value  that  is  most  likely  to  have  produced  the 
responses  that  he  or  she  actually  made  given  the  estimated  item  parameters 
for the items taken. A maximum likelihood estimate of θ cannot be computed
for examinees with zero scores, perfect scores, or a small number of other
response patterns where the MLE of θ attempts to go to minus infinity.

Proficiency  estimates  were  computed  for  all  examinees  for  all 
grade/ages  who  took  booklets  that  had  seventeen  or  more  reading  items 
calibrated  for  that  particular  grade/age.    Item  N001801,  which  had  a  flat 
item  response  function,  was  not  used.    The  proficiency  range  was  bounded  by 
-7  and  5.    The  "direct"  method  refined  by  several  Newton  iterations  was 
used.    The  direct  method  consists  of  computing  the  likelihood  function  for 
equally-spaced  proficiencies  between  -7  and  5  and  selecting  the  proficiency 
corresponding  to  the  maximum  of  these  values  of  the  likelihood  function. 
This  proficiency  is  then  used  as  a  starting  value  for  Newton's  method. 

The  standard  error  of  estimation  of  the  maximum  likelihood  estimate  of 
proficiency was computed for each MLE. The standard error of estimation
indicates  the  precision  of  measurement  of  the  maximum  likelihood  estimate. 
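The "direct" method followed by Newton refinement, together with a standard error, can be sketched as below. This is not the MLE-ABIL program: the function names, the grid of 121 points, the use of finite-difference derivatives, and the constant D = 1.7 are assumptions of the sketch. The standard error is taken from the curvature of the log-likelihood at the estimate, which is one common choice; the text does not say which formula was used.

```python
import math

D = 1.7  # logistic scaling constant (an assumption of this sketch)

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def log_likelihood(theta, items, scores):
    """items: (a, b, c) tuples; scores: 1.0 right, 0.0 wrong."""
    ll = 0.0
    for (a, b, c), u in zip(items, scores):
        p = min(max(p3pl(theta, a, b, c), 1e-12), 1.0 - 1e-12)  # guard the logs
        ll += u * math.log(p) + (1.0 - u) * math.log(1.0 - p)
    return ll

def mle_theta(items, scores, lo=-7.0, hi=5.0, n_grid=121, n_newton=5, h=1e-4):
    """Grid search on [lo, hi], then a few Newton steps; returns (theta, se)."""
    f = lambda t: log_likelihood(t, items, scores)
    d2 = lambda t: (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)
    grid = [lo + (hi - lo) * k / (n_grid - 1) for k in range(n_grid)]
    theta = max(grid, key=f)  # "direct" step: best equally spaced point
    for _ in range(n_newton):
        d1 = (f(theta + h) - f(theta - h)) / (2.0 * h)
        curv = d2(theta)
        if curv >= 0.0:
            break  # no usable curvature; keep the current value
        theta = min(max(theta - d1 / curv, lo), hi)  # Newton step, kept in bounds
    curv = d2(theta)
    se = 1.0 / math.sqrt(-curv) if curv < 0.0 else None
    return theta, se

# Two hypothetical items; one right and one wrong gives a finite estimate,
# while a perfect score runs to the upper bound of the range.
items = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
theta_mixed, se_mixed = mle_theta(items, [1.0, 0.0])
theta_perfect, _ = mle_theta(items, [1.0, 1.0])
```

The perfect-score case shows why such patterns have no finite maximum likelihood estimate: the likelihood keeps increasing and the estimate stops only at the imposed bound.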

A  score  on  the  xi  scale  was  also  computed  for  each  examinee.    This  is  a 
nonlinear transformation of the θ scale to a number-right true score scale.
The  xi  scale  refers  to  the  number  of  correct  responses  that  might  be 
expected  if  the  entire  pool  of  228  reading  exercises  included  in  the  IRT 
scaling  were  administered  as  a  single  test.    This  scale  runs  from  48.7 
("chance"  level)  to  228  (perfect  score). 

The  proficiency  estimate  was  also  converted  to  a  reading  proficiency 
(RP)  score.      This  refers  to  the  expected  number  of  correct  responses  that 
the  examinee  would  get  on  a  hypothetical  500-item  test.    The  IRT  parameters 
for  the  items  on  this  test  have  equal  item  discriminations,  at  the  average 
level  of  the  actual  NAEP  items  on  the  scale;  equal  lower  asymptotes  of 
zero; and equally-spaced difficulty parameters ranging from -5 to +4.98 on
the proficiency scale. The relationship between this scale and the θ scale
is virtually linear for proficiencies from -4 to +4 and is approximated by
the relationship

$$\text{Proficiency} = 50\,\theta + 250.5$$
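Both derived scales are expected number-right scores, and can be sketched as below. The function names, the constant D = 1.7, and the discrimination value `a_mean = 1.0` are assumptions: the text says only that the hypothetical items share the average discrimination of the actual NAEP items and does not give that value. The xi score sums the item response functions of the calibrated pool, so it tends to the sum of the lower asymptotes (the "chance" level) for very low proficiency. The RP score does the same for the hypothetical 500-item test with zero lower asymptotes.

```python
import math

D = 1.7  # logistic scaling constant (an assumption of this sketch)

def p3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def xi_score(theta, items):
    """Expected number right on the calibrated pool; items are (a, b, c)."""
    return sum(p3pl(theta, a, b, c) for a, b, c in items)

def rp_score(theta, a_mean=1.0, n_items=500, b_min=-5.0, b_step=0.02):
    """Expected number right on the hypothetical test.

    Difficulties run from -5 to +4.98 in steps of 0.02; a_mean is a
    placeholder for the unreported average discrimination.
    """
    return sum(p3pl(theta, a_mean, b_min + b_step * j, 0.0)
               for j in range(n_items))

rp_at_zero = rp_score(0.0)
rp_gain_per_unit = rp_score(1.0) - rp_score(0.0)
```

Under these assumptions, `rp_at_zero` and `rp_gain_per_unit` stay close to the intercept of 250.5 and the slope of 50 in the linear approximation.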


The  standard  errors  of  both  xi  and  reading  proficiency  scores  were 
computed.    MLE  proficiency  estimates  on  the  xi  scale  and  the  reading 
proficiency scale were computed for the same examinees for whom
proficiency estimates on the θ scale were computed.

The  following  bounds  were  placed  on  the  values  for  the  various  scales. 
For examinees whose proficiencies could be estimated, the θ scale was
limited to a range of -7 to 5. The standard error of θ was limited to a
maximum of 998. The xi scale was limited to a range of 48.7 to 228. The
standard error of xi was limited to a maximum of 228. The proficiency scale
was limited to a range of 1 to 449 (the score to which a θ of 4 converts on
the proficiency scale).
The  maximum  standard  error  on  the  proficiency  scale  was  500. 

Table  10.2(1)  indicates  the  arbitrary  values  flagging  examinees  for 
whom  a  maximum  likelihood  estimate  could  not  be  obtained. 

Because  of  the  short  length  of  the  tests  and  the  wide  range  of 
abilities spanning Grades 4 to 11, there were many zero and perfect scores.
These  scores  at  the  extremes  distorted  statistics  computed  for  different 
subgroups  but  they  could  not  be  dropped  without  destroying  the 
representativeness  of  the  sample  to  the  population.    Consequently  a 
procedure  that  is  a  type  of  Winsorizing  (Huber,  1981)  was  devised,  to  bring 
these  extreme  values  closer  to  the  rest  of  the  values  in  the  distribution. 

This  was  done  by  computing  the  "inner  fences"  (Tukey,  1977,  p.  44)  as 
boundaries  to  the  distribution  and  setting  all  values  outside  of  the 
boundaries  at  the  appropriate  boundary.    The  boundaries  were  computed  by 
first computing the hingespread, H, which is the difference between the
25th percentile ($Q_1$) and the 75th percentile ($Q_3$), then computing the
minimum and maximum boundary (or inner fences) as follows:

$$H = Q_3 - Q_1$$

$$\text{minimum} = Q_1 - 1.5H$$

$$\text{maximum} = Q_3 + 1.5H$$


All  values  below  the  minimum  were  set  to  the  minimum;  all  values  above  the 
maximum  were  set  to  the  maximum. 
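The fence computation can be sketched in a few lines of Python. This is a minimal illustration, not the procedure as actually programmed for NAEP: the function name and the sample scores are hypothetical, ordinary unweighted quartiles stand in for Tukey's hinges, and sampling weights are ignored.

```python
from statistics import quantiles


def winsorize_inner_fences(values):
    """Clip values to inner fences; return clipped data, fences, and counts."""
    # Assumption: unweighted quartiles are used in place of Tukey's hinges.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    hingespread = q3 - q1
    minimum = q1 - 1.5 * hingespread
    maximum = q3 + 1.5 * hingespread
    # Values outside a fence are moved to that fence; others are unchanged.
    clipped = [min(max(v, minimum), maximum) for v in values]
    moved_low = sum(v < minimum for v in values)
    moved_high = sum(v > maximum for v in values)
    return clipped, minimum, maximum, moved_low, moved_high


# Hypothetical scale scores with one zero-score flag (1) and one perfect (449).
scores = [1, 200, 210, 220, 230, 240, 449]
clipped, minimum, maximum, moved_low, moved_high = winsorize_inner_fences(scores)
```

Counting the values moved to each fence, as the function does, gives the kind of tally reported by grade/age in Table 10.2(2).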

Table  10.2(2)  shows  the  minimum  and  maximum  scores  and  the  number  of 
values  changed  for  each  grade/age  for  the  Year  15  BIB  spiral  data. 




Table 10.2(1)

Values Assigned to Examinees Whose Maximum Likelihood
Estimate Could Not Be Computed

                                Flag for              Flag for              Flag for
                                standard              standard              standard
                                error                 error       RP        error
                    $\theta$    ($\theta$)    xi      (xi)        score     (RP)

Zero score          -100.       999.          48.7    999.          1.      999.
MLE below
  lower limit       -100.       999.          48.7    999.          1.      999.
Perfect score        100.       999.         228.0    999.        449.      999.


Table 10.2(2)

Minimum and Maximum Scores
and Number of Values Changed
by Grade/Age

          Percent                                                   Percent
Grade/    Moved to                                                  Moved to
Age       Minimum    Minimum    Q1     Median    Q3     Maximum     Maximum

4/9       6.4%        82        177    212.5     241    336         1.0%

8/13      4.5%       161        233    257.5     281    353         1.2%

11/17     2.9%       189        262    287.0     311    385         4.0%




The groupings of extreme scores, even after this modification, produce
the distributions shown as Figure 10.2-1. The proportions of a grade/age
population accounted for in extreme groups depend in part upon features
unrelated to the true distributions, such as the numbers and difficulties
of exercises administered to pupils. Doubts thus arise about using these
distributions of estimates to approximate characteristics of the
distribution of underlying proficiencies, where no such anomalies are
anticipated. Methods intended to estimate the underlying distribution
directly, bypassing the intermediate and problematic step of estimating
scores for individual examinees, are described in Chapter 10.3.




Figure 10.2-1
Distributions of Adjusted Proficiency Scale Scores

[Figure: three histogram panels, one each for Grade 4/Age 9, Grade 8/Age 13,
and Grade 11/Age 17. Horizontal axis: scaled MLE, 50 to 450. Vertical axis
ticks at 0, 15, and 30.]


Chapter  10.3 
MARGINAL  ESTIMATION  PROCEDURES 


Robert  J.  Mislevy 
Kathleen  M.  Sheehan 

Educational  Testing  Service 


Item response theory (IRT) offers NAEP the advantages of efficiency in
the estimation of population characteristics, common-scale measurement
across forms and over time, and results that are interpretable in terms of
expected behavior on specific tasks. The experiences described in the
preceding chapter proved, however, that these advantages could not be
attained with standard IRT measurement procedures. The NAEP data are
simply too sparse at the level of the individual examinee to support
precise individual point estimates, that is, estimates which could be used
in turn to estimate parameters for cognitive items, population
characteristics, and relationships between performance and background
variables.

However, it is exactly these latter population-level parameters, rather
than parameters for specific examinees, that are of interest in NAEP. NAEP
objectives can therefore be attained with methodologies that produce
population parameters directly, without the intermediary computation of
parameters for individuals. To this end, marginal estimation techniques
for latent variables (e.g., Bock & Aitkin, 1981; Mislevy, 1985a) were
extended to the setting of survey samples by means of Rubin's (1977, 1978)
multiple imputation techniques for missing data. A technical description
of the resulting procedure is given in Mislevy (1985b). The purposes of
this chapter are to (1) review the procedures in general terms and (2)
provide details of their implementation in the Year 15 NAEP reading
analyses. The steps in those analyses, which will be discussed in turn
after an overview of the procedures, are as follows:


Year  15  BIB  Data 

Estimation  of  item  parameters 
Estimation  of  conditional  effects 
Generation  of  "plausible  values" 

Year 15 Paced Data

Estimation  of  conditional  effects 
Generation  of  "plausible  values" 
Equating  to  BIB  scale 




(The  estimation  of  item  parameters  under  paced  administration 
conditions  was  carried  out  in  conjunction  with  the  analysis  of  data  from 
previous  NAEP  reading  assessments,  and  will  be  described  in  Chapter  10.4, 
Estimation  of  Trends.) 


10.3.1    The  General  Model 

The object of inference in a sample survey is a (possibly
vector-valued) function T of the values of survey variables in all N
members of the population. This value is estimated by a function t of the
values obtained from a sample of size n. The precision of t as an estimate
of T is indicated by another function of the sampled values, namely the
estimated variance var(t), which approximates the true variance of t, or
VAR(t).

To enable discussion, we shall denote the (possibly vector-valued)
proficiency of examinee i by $\theta_i$, and denote by $\theta$ the values in the
entire realized sample. Let $y_i$ and $y$ denote similarly defined values of
background and attitude variables for examinee i and for the entire
realized sample respectively. If $\Theta$ and $Y$ represent correspondingly
defined values in the population as a whole, then T is a function of
$\Theta$ and $Y$, while t and var(t) are functions of $\theta$ and $y$.

The formulations above assume that $\theta$ is observed without error. This
is not the case in NAEP under the assumed IRT model. Instead, observations
are of the form $x_{ij}$, the response of examinee i to cognitive item j, for
$j = 1, \ldots, m$. These responses are assumed to be governed by the IRT model,
under which the probability of a given response depends on the (unobserved)
proficiency of the examinee and the (unknown) parameters $\beta_j$ of the item
through the IRT function $p(x_{ij} = 1 \mid \theta_i, \beta_j)$.

A latent variable like $\theta$ in an IRT model can be thought of as a
variable whose value is missing for all examinees. Under Rubin's (1977)
approach to missing values in survey samples, a reasonable estimate of T is
obtained by computing the expectation of t, given values of variables that
were not missing, i.e.,

$$t^*(x, y) = E\left[\, t(\theta, y) \mid x, y \,\right] \qquad (1)$$

Equation (1) may be thought of as an average of $t(\theta, y)$ computed
over all possible values of the unobserved variable $\theta$, with each weighted
in proportion to its consonance with the observed values $x$ and $y$.
Furthermore, the variance of $t^*$ can be approximated by

$$\mathrm{var}(t^*) = E\left[\, \mathrm{var}(t(\theta, y)) \mid x, y \,\right] + \mathrm{Var}\left[\, t(\theta, y) \mid x, y \,\right] \qquad (2)$$




(Herzog & Rubin, 1983). This variance estimator is the sum of two
components: the expected value of the variance of t, which indicates
uncertainty due to sampling from the population, and the variance of
$t(\theta, y)$ given $x$ and $y$, which indicates uncertainty due to not knowing
the $\theta$ values of the examinees in the realized sample.

The evaluation of Equations (1) and (2) requires the conditional
distribution of the latent variables $\theta$ given the observed variables
$x$ and $y$, or $p(\theta \mid x, y)$. Standard rules of the calculus of probabilities
allow this distribution to be expressed as a constant times the product of
two terms, or

$$p(\theta \mid x, y) \propto p(x \mid \theta, y)\, p(\theta \mid y) \qquad (3)$$


The first term in the right hand side of this expression is given by
the item response model. By conditional independence,

$$p(x \mid \theta, y) = \prod_i \prod_j p(x_{ij} \mid \theta_i, \beta_j)$$

where again $\beta_j$ is the unknown and possibly vector-valued parameter for
item j. If multiple scales pertaining to mutually-exclusive subsets of items
are entertained, this term may be written as

$$p(x \mid \theta, y) = \prod_i \prod_k \prod_j p(x_{ijk} \mid \theta_{ik}, \beta_{jk})$$

where k indexes scales and $\beta_{jk}$ is the parameter for item j within scale k.

Assuming independence over examinees, the second term in Equation
(3) can be written as

$$p(\theta \mid y) = \prod_i p(\theta_i \mid y_i, \alpha)$$

the product over examinees of conditional distributions of the values of
their latent variables given their observed responses to background and
attitude items. Here $\alpha$ represents the (unknown) parameters of these
distributions. Suppose, for example, normal distributions are assumed for
conditional distributions whose means are determined by $y$. In this case
$\alpha$ might consist of a common conditional variance and regression
parameters that yield the conditional means.

The unknown item parameters $\beta$ and the parameters $\alpha$ of the conditional
distributions of $\theta$ given background variables $y$ can be estimated precisely
from large samples of examinees even when individual examinees' parameters
cannot. This may be accomplished by so-called "marginal" estimation
procedures that, in statistical terms, treat examinee parameters as random
rather than fixed effects. Both sets of parameters may be estimated
simultaneously by the method of maximum likelihood, for example, by
maximizing the following marginal likelihood function with respect
to $\alpha$ and $\beta$:

$$L(\alpha, \beta \mid x, y) = \prod_i \int p(x_i \mid \theta, y_i, \beta)\, p(\theta \mid y_i, \alpha)\, d\theta \qquad (4)$$
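To make the marginal likelihood concrete, the following Python sketch evaluates its logarithm numerically on a grid of proficiency values. It is an illustration under stated assumptions, not the BILOG or M-GROUP algorithm: the function names are hypothetical, the item response function is a three-parameter logistic without a scaling constant, the conditional distribution is normal with a supplied mean for each examinee, and the integral is replaced by a simple rectangle rule.

```python
import math


def p_correct(theta, a, b, c):
    """Three-parameter logistic item response function (assumed form)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def response_likelihood(x, theta, items):
    """Likelihood of one response vector at theta; None marks items not given."""
    like = 1.0
    for resp, (a, b, c) in zip(x, items):
        if resp is None:
            continue
        p = p_correct(theta, a, b, c)
        like *= p if resp == 1 else 1.0 - p
    return like


def normal_pdf(theta, mean, sd):
    z = (theta - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def log_marginal_likelihood(responses, cond_means, sd, items, grid):
    """Sum over examinees of the log of the integral over theta (rectangle rule)."""
    step = grid[1] - grid[0]
    total = 0.0
    for x, mean in zip(responses, cond_means):
        integral = step * sum(
            response_likelihood(x, t, items) * normal_pdf(t, mean, sd) for t in grid
        )
        total += math.log(integral)
    return total


# Hypothetical one-item example on a 40-point grid.
items = [(1.0, -2.0, 0.2)]
grid = [-4.875 + 0.25 * q for q in range(40)]
ll_right = log_marginal_likelihood([[1]], [0.0], 1.0, items, grid)
ll_wrong = log_marginal_likelihood([[0]], [0.0], 1.0, items, grid)
```

Because no individual proficiency is ever estimated, the function can be evaluated even for an examinee with a single response, which is the property that makes marginal estimation workable with sparse data.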

An algorithm to accomplish this task is given in Mislevy (1986a). This
algorithm was applied to the Year 15 reading data in two steps. First, the
vector of item parameters $\beta$ was estimated with respect to an unrestricted
$\theta$ distribution. Second, the conditional effects $\alpha$ were estimated with
$\beta$ fixed at its maximizing value. The first step was accomplished using
BILOG (Mislevy & Bock, 1982). The second step was accomplished using M-GROUP
(Sheehan, 1985).

The parameter estimates $\hat{\beta}$ and $\hat{\alpha}$ were then used to approximate the
conditional distribution of $\theta$ given $x$ and $y$, for each examinee, as follows:

$$p(\theta \mid x_i, y_i) = p(\theta \mid x_i, y_i, \alpha = \hat{\alpha}, \beta = \hat{\beta}) \qquad (5)$$

where $\hat{\beta}$ = estimated item parameters obtained from step 1, and
$\hat{\alpha}$ = estimated conditional effects obtained from step 2.


10.3.1.1    "Plausible  Values" 

Two considerations merit attention at this point. First, even when
point estimates of $\alpha$ and $\beta$ are used to approximate $p(\theta \mid x, y)$ in the
manner described above, neither closed-form expressions nor convenient
analytic approximations are generally available; instead, numerical
approximations must be employed. Second, it is not possible in a survey with
as many background variables as the NAEP survey to model in detail the full
conditional distribution $p(\theta \mid y)$; only selected background variables can be
included, and even then, a simplified functional form must be used. These
considerations are discussed, in turn, below.

The numerical approximations employed by M-GROUP can be characterized
by (a) the representation of smooth functions such as $p(x_i \mid \theta, y_i)$,
$p(\theta \mid y_i)$, and $p(\theta \mid x_i, y_i)$ as histograms over points that span the
range of $\theta$, and (b) Monte Carlo evaluation of required integrals via
repeated samples from $p(\theta \mid x, y)$. Each histogram is defined over a
predetermined grid of points $A$. $p(\theta \mid x_i, y_i)$ is then approximated, at
each point $A_q$ in $A$, as

$$P(\theta = A_q \mid x_i, y_i) = C_i\, p(x_i \mid A_q, \beta)\, p(A_q \mid y_i, \alpha) \qquad (6)$$

where $C_i$ is a normalizing constant. A value $\theta_i$ may then be drawn at random
from the histogram in two steps. First, a bar is selected at random from
the histogram in accordance with the probabilities given by Equation (6).
Second, a value is selected at random from that interval. Carrying out
these steps for each examinee in the sample yields a pseudo-dataset, with
each examinee represented by a "plausible value" of what his or her
unobservable $\theta$ might be, given the observed values $x_i$ and $y_i$.
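The two-step draw can be sketched as follows. This is a minimal illustration rather than the M-GROUP code: the function names and the three-point grid are hypothetical, and the second step assumes a uniform draw within the selected bar, which the text does not specify.

```python
import random


def posterior_histogram(likelihoods, prior_weights):
    """Normalize likelihood-times-prior products into bar probabilities."""
    products = [l * p for l, p in zip(likelihoods, prior_weights)]
    total = sum(products)
    return [w / total for w in products]


def draw_plausible_value(grid, bar_probabilities, rng):
    """Step 1: pick a bar by its probability. Step 2: pick a point inside it."""
    step = grid[1] - grid[0]
    midpoint = rng.choices(grid, weights=bar_probabilities, k=1)[0]
    # Assumption: values are uniformly distributed within the chosen bar.
    return rng.uniform(midpoint - step / 2.0, midpoint + step / 2.0)


# Hypothetical three-bar example for a single examinee.
rng = random.Random(1984)
grid = [-1.0, 0.0, 1.0]
probs = posterior_histogram([0.2, 0.6, 0.2], [0.25, 0.5, 0.25])
value = draw_plausible_value(grid, probs, rng)
```

Repeating the draw for every examinee gives one pseudo-dataset; repeating the whole pass with fresh random numbers gives further pseudo-datasets.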

This construction guarantees that $t(\tilde{\theta}, y)$ and $\mathrm{var}(t(\tilde{\theta}, y))$ have
expectations equal to $E(t(\theta, y) \mid x, y)$ and $E(\mathrm{var}(t(\theta, y)) \mid x, y)$, the
values targeted in Equations (1) and (2). Let $\tilde{\theta}_k$ represent the vector of
plausible values comprising the kth of K pseudo-datasets. Under the
assumption that $p(x \mid \theta)$ and $p(\theta \mid y)$ have been correctly specified, a
consistent estimate of T is given by

$$t^* = K^{-1} \sum_k t(\tilde{\theta}_k, y)$$

Its variance, as an estimate of T, may be approximated as

$$\mathrm{var}(t^*) = K^{-1} \sum_k \mathrm{var}(t(\tilde{\theta}_k, y)) + K^{-1} \sum_k \left[\, t^* - t(\tilde{\theta}_k, y) \,\right]^2$$

This variance estimator is again the sum of two terms. The first,
representing sampling variability, may take the form of jackknifing t with
$\tilde{\theta}$ treated as an observed variable; alternatively, a simple random
sampling variance, again evaluated on $\tilde{\theta}$, may be boosted by a design
effect. The second term again reflects uncertainty due to the latency of
$\theta$.
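The combination of results across pseudo-datasets can be sketched as below. The function name and the numbers are hypothetical, and the divisor used for the second (between-dataset) term is an assumption of this sketch; dividing by K - 1 instead of K is a common alternative.

```python
from statistics import mean


def combine_plausible_values(estimates, sampling_variances):
    """Combine a statistic computed on each of K pseudo-datasets.

    estimates[k] is the statistic from pseudo-dataset k; sampling_variances[k]
    is its sampling variance (for example, from a jackknife).
    """
    k = len(estimates)
    t_star = mean(estimates)
    # First term: average sampling variance across pseudo-datasets.
    within = mean(sampling_variances)
    # Second term: spread of the estimates, reflecting the latency of theta.
    # Assumption: divisor K; K - 1 is a common alternative.
    between = sum((t_star - t) ** 2 for t in estimates) / k
    return t_star, within + between


# Hypothetical subgroup means from three pseudo-datasets.
t_star, total_variance = combine_plausible_values(
    [250.0, 252.0, 254.0], [4.0, 4.0, 4.0]
)
```

Treating any single pseudo-dataset as if it held observed proficiencies would report only the first term and so understate the uncertainty.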

10.3.1.2    Effects  of  Specification  Errors  on  Plausible  Values 

It  is  implicit  in  Equation  (3)  that  consistent  estimation  of  statistics 
involving  background  variables  requires  that  the  joint  density  of  those 
variables  with  the  unobservable  variable  be  specified  and  its  parameters 
estimated.    It  is  obviously  not  possible  to  compute  a  joint  distribution 
for  all  of  the  hundreds  of  NAEP  variables;  the  procedures  were  employed  for 
only  the  key  NAEP  reporting  variables  for  trends: 

     *  Age

     *  Grade

     *  At, above, or below modal age for grade

     *  At, above, or below modal grade for age

     *  Sex

     *  Ethnicity (Hispanic, black, and other)

     *  Size and type of community (high metro, low metro, and
        other)

     *  Parental education (higher of mother or father: less
        than high school, high school, or post high school)

     *  Region of the country

We shall refer to these as the "conditioning variables." Moreover,
only a main effects model, assuming normally distributed and homoscedastic
residual terms, could be employed due to computational limitations. The
distribution $p(\theta \mid y)$ is thus approximated by $p^*(\theta \mid y)$, where $p^*$
incorporates only main effects of the above-mentioned conditioning variables.

This  simplification  can  be  considered  a  "primary"  specification  error, 
"primary"  because  it  enters  into  the  generation  of  plausible  values.    It  is 
distinguished  from  a  "secondary"  specification  error,  which  would  refer  to, 
say,  omitting  variables  from  a  regression  equation  when  analyzing  a  given 
set  of  plausible  values.    The  consequences  of  primary  specification  error 
for  subsequent  analyses  can  be  expressed  as  follows: 


$$\text{Bias} = E \int t(\theta, Y) \left[\, p^*(\theta \mid X, Y) - p(\theta \mid X, Y) \,\right] d\theta \qquad (7)$$

where expectation is taken over X for fixed $\theta$. Of particular interest in
NAEP are biases corresponding to nonconditioned variables. That is,
$Y = (Y_1, Y_2)$, plausible values are generated under
$p^*(\theta \mid Y) = p(\theta \mid Y_1)$, and a statistic t that involves $\theta$ and $Y_2$ is
calculated under a secondary analysis.

Unfortunately,  simple  expressions  for  (7)  are  not  readily  available  in 
full  generality  for  all  the  statistics  that  could  be  computed  from  the  NAEP 
database.    Section  10.3.5  instead  presents  primary  specification  biases  in 
a  simplified  case  for  which  explicit  expressions  can  be  derived. 
Specifically, we shall assume a variant of the classical "true-score" model
of test theory, under which the variable $X = \theta + E$ is observed (along with
Y), with E independent and identically distributed normal over all
respondents. Note that no such X can be calculated for $\theta$ in the NAEP
setting of an IRT model with only a few responses per person. This
simplified setting does, however, provide both intuition and approximate
expressions for the more complex relationships between latent and observed



variables  that  are  embedded  in  IRT.  Details  are  given  in  Mislevy  (in 
progress). 


10.3.2    Estimation  of  Item  Parameters 

The LOGIST computer program (Wingersky, Barton, & Lord, 1982) had
originally been used to obtain estimated item parameters for 229 of the
reading items which were administered in the Year 15 assessment. However,
because the LOGIST results proved to be unsatisfactory, item parameters
were re-estimated using the BILOG computer program (Mislevy & Bock, 1982).
In both calibrations a three-parameter logistic IRT model was assumed.

Like the LOGIST program, BILOG requires an input data matrix consisting
of observed item responses which have been coded as right, wrong, omitted
or "not reached." The coding conventions developed for the LOGIST
calibration were used, without modification, for the BILOG calibration.
(These coding conventions are described in Section 10.2.1.) The BILOG
calibration also mirrored the LOGIST calibration in its treatment of
omitted and "not reached" responses. For the reasons presented in Section
10.2.1, responses coded as "not reached" were excluded and responses coded
as "omitted" were treated as fractionally correct, at a proportion equal to
the reciprocal of the number of valid response alternatives.
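The treatment of the four response codes can be sketched as below. This is an illustration only: the function names and string codes are hypothetical, the fraction for an omitted response is assumed to be the reciprocal of the number of response alternatives, and the way a fractional score enters the log-likelihood is an assumption about how such credit is typically applied, not a description of the LOGIST or BILOG source code.

```python
import math


def fractional_score(code, n_alternatives):
    """Map a response code to a score in [0, 1], or None if it is excluded."""
    if code == "right":
        return 1.0
    if code == "wrong":
        return 0.0
    if code == "omitted":
        # Assumption: an omit counts as 1/A correct for an A-alternative item.
        return 1.0 / n_alternatives
    if code == "not reached":
        return None  # excluded from the calibration
    raise ValueError(f"unknown response code: {code!r}")


def item_log_likelihood(score, p):
    """Log-likelihood contribution of one item, given P(correct) = p."""
    if score is None:
        return 0.0  # an excluded response contributes nothing
    return score * math.log(p) + (1.0 - score) * math.log(1.0 - p)


omit_score = fractional_score("omitted", 4)
skipped = item_log_likelihood(fractional_score("not reached", 4), 0.3)
```

Under this convention an omitted four-alternative item is scored 0.25, the success rate expected from blind guessing.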

The  major  difference  between  the  LOGIST  calibration  and  the  BILOG 
calibration  is  that  the  joint  estimation  procedures  employed  by  LOGIST 
require  that  a  point  estimate  of  proficiency  be  computed  for  each  subject, 
whereas  the  marginal  estimation  procedures  employed  by  BILOG  rely  on  Bayes 
Theorem  to  obtain  proficiency  distributions  for  all  subjects  without 
computing  individual  proficiency  point  estimates  for  any  subject. 

In  both  programs,  estimation  proceeds  in  cycles,  with  provisional 
proficiency  estimates  (or  distributions)  used  to  obtain  improved  item 
parameter  estimates  in  one  cycle  and  provisional  item  parameter  estimates 
used  to  obtain  improved  proficiency  estimates  (or  distributions)  in  the 
next  cycle. 

A  practical  result  of  the  differences  noted  between  the  estimation 
procedures  employed  by  BILOG  and  LOGIST  is  that  BILOG  does  not  require  that 
each  examinee  respond  to  a  minimum  number  of  items.    Instead,  BILOG's  data 
input  requirements  are  formulated  in  terms  of  the  number  of  examinees 
responding  to  each  item.    In  particular,  it  is  recommended  that  each 
calibration  be  performed  on  a  data  set  providing  a  minimum  of  1,000 
responses  for  each  item. 

Because  IRT  parameters  are  theoretically  sample-free,  and  IRT 
calibration  programs  are  generally  expensive  to  run,  many  IRT  models  are 
calibrated  from  a  sample  of  the  available  data.    The  calibration  sample  is 
typically  selected  to  meet,  but  not  exceed,  the  input  data  requirements  of 
the  particular  calibration  program  being  used.    When  the  calibration  sample 
is  randomly  selected  from  the  available  data,  resulting  parameter  estimates 
are  unbiased  estimates  of  those  that  would  have  been  obtained  if  all  of  the 




data  had  been  used,  as  long  as  all  of  the  data  input  requirements  have  been 
satisfied.    The  invariance  property  of  IRT  item  parameters  also  provides 
the  theoretical  justification  for  not  using  sampling  weights  during  the 
item  parameter  estimation  phase  of  an  IRT  calibration. 

Each of the 229 items which had been selected for use in the LOGIST
calibration was considered for inclusion in the BILOG calibration. All
but one were eventually included. Item 20 (ETS ID # N001201) was excluded
because it had exhibited severe convergence problems. The calibration
sample was selected from all examinees who took at least two reading
blocks, except for excluded blocks. (Blocks W and X were excluded for
Grade 4/Age 9. Blocks V, W, and X were excluded for Grade 8/Age 13 and
Grade 11/Age 17.) This BILOG sampling frame differed from the LOGIST
sampling frame in that examinees in Grade 8/Age 13 who took blocks L or Y
were not excluded. The BILOG and LOGIST sampling frames are summarized in
Table 10.3(1).

The final sample consisted of 10,286 examinees, or approximately
one fourth of the available subjects in each grade/age. This sample
provided approximately 1,000 examinees in each grade/age for each item.
However, since not all items were administered to all grade/ages, the total
number of examinees responding to each item ranged from approximately
1,000 to 3,000. As noted above, sampling weights were not employed in the
item calibration.

Several  modifications  were  made  to  the  BILOG  computer  program  to 
customize  it  for  use  with  NAEP  data.    One  modification  provided  an  option 
for  analyzing  items  with  variable  numbers  of  response  alternatives.  A 
second  modification  provided  a  capability  for  distinguishing  among  distinct 
subpopulations  of  examinees  in  the  calibration  sample.    This  capability  was 
required  to  avoid  the  gratuitous  assumption  that  examinees  in  different 
grade/ages  were  exchangeable  members  of  a  common  population.    A  final 
modification  provided  for  the  creation  of  an  output  file  containing  item 
fit  statistics  for  subpopulations  of  examinees. 

Although  the  three-parameter  model  has  been  shown  to  be  well  suited  for 
analyzing  NAEP  data,  it  does  have  some  unfortunate  characteristics.  One 
of these is a tendency to produce multi-collinearity when the response data
include very difficult or very easy items. In cases of multi-collinearity,
widely varying combinations of the (a,b,c) parameters can
produce similar response curves through the region of $\theta$ where the
calibration  sample  of  examinees  lies.    Without  constraints,  unstable  and 
unreasonable  (a,b,c)  triples  can  result.    BILOG  guards  against  these 
problems  by  supplying  Bayesian  priors  for  each  type  of  item  parameter,  with 
fixed  dispersions  and  with  locations  estimated  from  the  data.  Default 
priors  are  normal  for  b's,  with  a  standard  deviation  of  2;    log-normal  for 
a's  with  a  standard  deviation  of  1  for  log  a;  and  beta  for  c's,  with  the 
weight  of  20  observed  responses  from  low-ability  examinees.    These  default 
priors  proved  to  be  unsatisfactory  for  the  multiple-choice  items  in  the 
Reading  assessment,  primarily  because  of  the  presence  of  a  large  number  of 
very  easy  items.    In  particular,  estimated  c  values  tended  to  be  higher 
than  expected  (when  compared  with  the  reciprocals  of  the  numbers  of 




Table 10.3(1)

Blocks Selected for Scaling the Year 15 Reading Data

Block     Grade 4/Age 9     Grade 8/Age 13     Grade 11/Age 17

H               X                  X                   X
J               X                  X                   X
K               X                  X                   X
L               X                  0                   X
M               X                  X                   X
N               X                  X                   X
O               X                  X                   X
P               X                  X                   X
Q               X                  X                   X
R               X                  X                   X
U               X                  X                   X
V               X                  N                   N
W               N                  N                   N
X               N                  N                   N
Y               N                  0                   X

X = Included in both the LOGIST and BILOG item calibrations.
0 = Included in the BILOG item calibration only.
N = Not included in either calibration.




response alternatives) and estimated a's were lower than expected (when
compared with a values from free-response items). To force the program to
produce "more reasonable" estimates, the prior distributions were modified
in the following manner:

(1)  The  prior  standard  deviation  of  log  a  was  changed  from  1.0 
to  0.5,  and 

(2)  the  precision  of  the  beta  prior  on  asymptotes  was  increased 
from  the  weight  of  20  observations  to  the  weight  of  50 
observations. 

These changes resulted in item parameter estimates that were reasonable
in appearance and fit the data well. The resulting item parameter
estimates and corresponding standard errors are provided in Appendix B,
Table B-8. Because a linear indeterminacy exists with respect to the
values of $\theta$, a, and b in the three-parameter model, the parameter
estimates have been arbitrarily scaled so that the distribution of
proficiency in the calibration sample has a mean of zero and a standard
deviation of one.
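The linear indeterminacy can be illustrated with a short sketch. The function name and the numbers are hypothetical, and individual proficiency values are used only to make the arithmetic visible; in a marginal calibration the sample distribution of proficiency is estimated directly rather than from individual values. Shifting and stretching the proficiency scale, while adjusting a and b to match, leaves every quantity a(theta - b), and hence every response probability, unchanged.

```python
from statistics import mean, pstdev


def standardize_scale(thetas, items):
    """Rescale so proficiency has mean 0 and SD 1, adjusting (a, b, c) to match."""
    mu, sigma = mean(thetas), pstdev(thetas)
    new_thetas = [(t - mu) / sigma for t in thetas]
    # a is multiplied by sigma, b is shifted and divided by sigma, c is unchanged.
    new_items = [(a * sigma, (b - mu) / sigma, c) for a, b, c in items]
    return new_thetas, new_items


# Hypothetical proficiencies and one item on an arbitrary original scale.
thetas = [1.0, 2.0, 3.0]
items = [(1.2, 2.5, 0.2)]
new_thetas, new_items = standardize_scale(thetas, items)

old_logit = items[0][0] * (thetas[0] - items[0][1])
new_logit = new_items[0][0] * (new_thetas[0] - new_items[0][1])
```

The two logits agree, so the choice of a zero mean and unit standard deviation is a convention and not something the data can determine.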

Item  fit  was  evaluated  by  inspecting  residuals  from  fitted  item 
response  curves.    A  typical  plot  is  shown  in  Figure  10.3-1.    The  smooth 
line  is  the  fitted  three-parameter  logistic  item  response  curve;  the  three 
plot  symbols  represent  the  expected  proportions  of  correct  responses  for 
examinees  in  each  grade/age  at  various  points  along  the  reading  proficiency 
scale.    These  expected  proportions  were  calculated  without  assuming  the 
three-parameter  functional  form.    (These  plots  were  produced  by  a  special 
modification of BILOG. Each is based on pseudo-counts of attempts and
corrects to an item produced by an additional E-step of its EM algorithm.
See Bock and Aitkin, 1981, for details.) The size of each symbol is
proportional  to  the  amount  of  information  available  in  the  calibration 
dataset  in  the  region  of  the  scale  where  the  symbol  is  plotted. 

Item  bias  was  evaluated  by  inspecting  residuals  for  examinee 
subpopulations  defined  by  sex  and  ethnicity.    (These  plots  were  produced  by 
a  special  modification  of  LOGIST.    Each  is  based  on  counts  of  attempts  and 
corrects  to  an  item  from  groups  of  examinees  with  similar  estimated 
abilities.)    A  typical  plot  is  shown  in  Figure  10.3-2.    In  this  figure,  the 
plot  symbols  distinguish  between  subpopulations  defined  by  sex.    Plots  such 
as  those  depicted  in  Figures  10.3-1  and  10.3-2    were  examined  for  all 
items.    Copies  of  these  plots  are  available  from  ETS  upon  request. 


10.3.3    Estimation  of  Conditional  Effects 

Conditional  distributions  of  reading  proficiency  given  background 
responses  were  estimated  separately  for  examinees  in  each  grade/age.  The 
number of background variables which could be included in each
within-grade/age model was limited by the availability of computing
resources. The background variables selected included sex, imputed
ethnicity,  size  and  type  of  community  (STOC),  region,  and  parental 




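Figure 10.3-1
Diagnostic Fit Plot for Item 9 (ID=N001502)

[Figure: fitted item response curve with plot symbols for Grade 4/Age 9,
Grade 8/Age 13 (O), and Grade 11/Age 17 (X). Horizontal axis: theta, -4 to 4.]

Figure 10.3-2
Bias Plot for Item 10 (ID=N001503)

[Figure: plot symbols O = Male, A = Female, □ = Total. Horizontal axis:
theta, -3 to 3.]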



education  (these  variables  are  defined  in  Section  12.1).    Differences  in 
reading  proficiency  resulting  from  grade  and/or  age  differences  within  a 
single  grade/age  were  taken  into  account  by  including  a  grade/age  variable 
in  each  model.    For  examinees  in  Grade  4/Age  9,  the  grade/age  variable  was 
defined  as  follows: 


Level  Description 

1  <9  years,  grade  =4 

2  =9  years,  grade  <4 

3  =9  years,  grade  =4 

4  =9  years,  grade  >4 

5  >9  years,  grade  =4 


Similar variables were defined for examinees in Grade 8/Age 13 and
Grade 11/Age 17.
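The five-level grade/age variable can be expressed as a small function. The name and the treatment of the other grade/ages through `modal_age` and `modal_grade` parameters are assumptions of this sketch; the text says only that similar variables were defined.

```python
def grade_age_level(age, grade, modal_age=9, modal_grade=4):
    """Return the grade/age level (1 to 5), or None if outside the sample."""
    if age < modal_age and grade == modal_grade:
        return 1
    if age == modal_age and grade < modal_grade:
        return 2
    if age == modal_age and grade == modal_grade:
        return 3
    if age == modal_age and grade > modal_grade:
        return 4
    if age > modal_age and grade == modal_grade:
        return 5
    # Neither at the modal age nor in the modal grade: not in this sample.
    return None


levels = [grade_age_level(a, g) for a, g in [(8, 4), (9, 3), (9, 4), (9, 5), (10, 4)]]
```

Every sampled examinee is either at the modal age or in the modal grade, which is why five levels suffice.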

A main effects model was assumed for each grade/age. Mislevy's GROUP
computer program (1984a) was used to estimate each model. In this program,
examinees are grouped into a number of distinct cells based on their
responses to the selected background variables. Reading proficiency, $\theta$,
is assumed to be normally distributed with a common variance within each
cell. That is,

$$p(\theta_i \mid y_i, v_i) = N(v_i' \Gamma, \sigma^2)$$

where $\Gamma$ is a vector of parameters corresponding to the demographic main
effects and $v_i$ is a vector characterizing the status of examinee i on those
effects. Each demographic variable is represented by between one and four
elements in $\Gamma$ and $v_i$, depending on the number of levels used for that
variable in the coding scheme.

The GROUP program uses an iterative procedure to estimate the elements
of $\Gamma$ and the common within-cell variance $\sigma^2$. At each iteration, the
normal distribution of reading proficiency in each cell is approximated as a
histogram over 40 equally-spaced points from -4.875 to +4.875. Item
parameters are assumed to be known. Sampling weights are taken into
account. Iteration ends when the largest change in any effect is less than
.01.
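The histogram approximation can be sketched as follows. The function names are hypothetical and the normalization (bar heights summing to one) is an assumption of this sketch; it shows only how 40 equally spaced points from -4.875 to +4.875 imply a spacing of 0.25 and how a normal density is discretized over them.

```python
import math


def proficiency_grid(n_points=40, low=-4.875, high=4.875):
    """Equally spaced grid of proficiency values."""
    step = (high - low) / (n_points - 1)
    return [low + q * step for q in range(n_points)]


def normal_histogram(grid, mean, variance):
    """Discretize a normal density over the grid; bar heights sum to one."""
    dens = [math.exp(-0.5 * (t - mean) ** 2 / variance) for t in grid]
    total = sum(dens)
    return [d / total for d in dens]


grid = proficiency_grid()
hist = normal_histogram(grid, mean=0.0, variance=1.0)
spacing = grid[1] - grid[0]
```

With a cell mean of zero the resulting histogram is symmetric, and almost all of its mass falls well inside the grid.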

Details of the coding scheme developed for the Year 15 Reading
Assessment are given in Table 10.3(2). As indicated in the table, two
different methods of handling missing data were used. For some background
variables, missing values were treated as valid responses, that is, a
particular level of the coded variable was defined to include missing
values. For example, Level 2 of the ethnicity variable ("White and
Other") includes missing values. Thus, examinees with unknown ethnicity
were included in the estimation of the "White and Other" group mean. The




Table 10.3(2)

Coding of Background Variables
Year 15 BIB Data

Variable       Level  Description               Code    Notes

Intercept      1.     All subjects              1

Sex            1.     Male                      0       Subjects with missing
               2.     Female                    1       values excluded.

Ethnicity      1.     Black                     00      Subjects with missing
               2.     White and Other           10      values assigned to Level 2.
               3.     Hispanic                  01

STOC           1.     Low Metro                 00      Subjects with missing
               2.     High Metro                10      values assigned to Level 3.
               3.     Not High or Low Metro     01

Region         1.     Northeast                 000     Subjects with missing
               2.     Central                   100     values excluded.
               3.     Southeast                 010
               4.     West                      001

Parental Ed    1.     Less than HS              000     Subjects with missing
               2.     High School               100     values assigned to Level 4.
               3.     Beyond HS                 010
               4.     All else                  001

Grade/Age      1.     < M age, = M grade        0000    Subjects with missing
               2.     = M age, < M grade        1000    values excluded.
               3.     = M age, = M grade        0100    (M = modal)
               4.     = M age, > M grade        0010
               5.     > M age, = M grade        0001

Misc.          1.     Subjects with             1
                      unrecoverable
                      missing data.


305 


ERIC 


second  method  of  handling  missing  values  was  developed  for  background 
variables,  such  as  sex  or  region,  for  which  no  single  level  could 
reasonably  be  defined  to  include  missing  values.  Examinees  with  missing 
values  for  these  other  background  variables  were  assigned  to  the  "Misc." 
effect  and  were  excluded  from  the  calculations  for  all  but  the  "Misc." 
group  mean. 

The dataset used to estimate conditional effects included all examinees who responded to at least one calibrated reading item. Table 10.3(3) lists the number of examinees used to estimate each within-grade/age model along with the estimated conditional effects. An estimate of the common within-cell variance of each conditional distribution is also provided.


10.3.4    Generation of Plausible Values

A plausibility distribution was estimated for each examinee who was administered at least one of the blocks listed in Table 10.3(1). These distributions took the form of histograms over 40 equally-spaced values of reading proficiency between -4.875 and +4.875. The density of the qth bar of the histogram estimated for the ith examinee was obtained as follows:

\[ P(A_q \mid x_i, y_i) = \frac{P(x_i \mid \theta_i = A_q, \hat\beta)\, P(A_q \mid y_i, \hat\alpha)}{\sum_{s=1}^{40} P(x_i \mid \theta_i = A_s, \hat\beta)\, P(A_s \mid y_i, \hat\alpha)} \]

where

$A_q$ = the proficiency value associated with the qth bar of the histogram, typically the midpoint of the interval;

$x_i$ = vector of observed item responses;

$y_i$ = responses to background and attitude items;

$\hat\beta$ = estimated item parameters;

$\hat\alpha$ = estimated conditional effects;

$P(x_i \mid \theta_i = A_q, \hat\beta)$ gives the probability of observing $x_i$ given proficiency $\theta_i = A_q$; and

$P(A_q \mid y_i, \hat\alpha)$ gives the conditional probability of $A_q$ given background variables $y_i$.
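
The construction of a plausibility histogram can be sketched as follows: at each of the 40 grid values, the likelihood of the observed item responses is multiplied by the conditional (prior) density implied by the background variables, and the products are rescaled to sum to one. This is an illustration only. The three-parameter logistic response function with a 1.7 scaling constant, the function names, and the item parameter values are all assumptions, not details taken from this section.

```python
import math

# Grid of 40 equally spaced proficiency values (spacing 0.25).
grid = [-4.875 + 0.25 * q for q in range(40)]

def p_correct(theta, a, b, c):
    """Assumed 3PL item response function (hypothetical choice)."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

def posterior_histogram(responses, items, prior_mean, prior_var, grid):
    """Posterior weight of each grid value given responses and a normal prior.

    responses: list of 1 (correct), 0 (incorrect), or None (not presented)
    items:     list of (a, b, c) item parameter tuples, assumed known
    """
    post = []
    for a_q in grid:
        prior = math.exp(-0.5 * (a_q - prior_mean) ** 2 / prior_var)
        like = 1.0
        for x, (a, b, c) in zip(responses, items):
            if x is None:
                continue  # items not presented carry no information
            p = p_correct(a_q, a, b, c)
            like *= p if x == 1 else (1.0 - p)
        post.append(like * prior)
    total = sum(post)
    return [w / total for w in post]

items = [(1.0, 0.0, 0.2)] * 5          # hypothetical item parameters
hist_hi = posterior_histogram([1] * 5, items, 0.0, 1.0, grid)
hist_lo = posterior_histogram([0] * 5, items, 0.0, 1.0, grid)
mean_hi = sum(a * w for a, w in zip(grid, hist_hi))
mean_lo = sum(a * w for a, w in zip(grid, hist_lo))
print(mean_hi, mean_lo)
```

An examinee who answers every item correctly ends up with a histogram shifted upward from the prior mean, and one who answers none correctly with a histogram shifted downward.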


Table 10.3(3)

Estimated Conditional Effects
Year 15 BIB Data

                                             Grade 4/     Grade 8/     Grade 11/
Effect         Level                         Age 9        Age 13       Age 17

Intercept      All subjects                 -1.350812    -0.432764     0.159135

Sex            Female                        0.096410     0.139017     0.159856

Ethnicity      White and Other               0.460286     0.402945     0.405459
               Hispanic                      0.076037     0.112633     0.134779

STOC           High Metro                    0.490461     0.307583     0.229757
               Not High or Lo Metro          0.243873     0.122311     0.147790

Region         Central                      -0.132867    -0.042057     0.027691
               South East                   -0.008895    -0.020629     0.023135
               West                         -0.086579    -0.042722     0.005118

Parental Ed.   High School                   0.209282     0.139972     0.081576
               Beyond HS                     0.395126     0.404412     0.379261
               All else                      0.119694    -0.017331    -0.075156

Grade/Age      = M age, < M grade           -0.671670    -0.433070    -0.616764
               = M age, = M grade           -0.064834    -0.012745    -0.0[?]3857
               = M age, > M grade            0.338318     0.548805     0.076713
               > M age, = M grade           -0.307180    -0.259528    -0.533380

Misc.          Subjects with                 0.509864    -0.329341     0.810939
               unrecoverable
               missing values.

Number of Examinees                          22,950       23,553       23,932
Estimated Variances                          0.46446      0.38564      0.45672


Five  plausible  values  were  obtained  for  each  examinee  by  sampling  at 
random  from  these  histograms.  For  each  plausible  value  generated,  a 
two-step  sampling  procedure  was  required.    In  the  first  step  of  the 
procedure,  a  single  random  digit  was  used  to  target  a  particular  block  of 
the  histogram.    In  the  second  step,  a  particular  value  of  reading 
proficiency  was  chosen  from  within  that  block  based  on  the  value  of  a 
second  random  digit.    Details  of  this  sampling  procedure  are  given  in  Table 
10.3(4). 
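
The two-step draw can be sketched as below: the first random number selects a bar by accumulating bar densities until the running total exceeds it, and the second places the value uniformly within the selected bar. This is a hedged illustration rather than the production procedure. The function name `draw_plausible_value` is hypothetical, and the default within-bar width (the spacing between adjacent grid values) is an assumption; a different width can be passed through `bar_width`.

```python
import random

def draw_plausible_value(grid, weights, rng, bar_width=None):
    """Draw one value from a histogram whose weights sum to 1."""
    if bar_width is None:
        bar_width = grid[1] - grid[0]   # assumed width of each bar
    # Step 1: pick bar k so that the cumulative weight first exceeds r.
    r = rng.random()
    cumulative = 0.0
    k = len(grid) - 1                   # fallback guards against rounding
    for q, w in enumerate(weights):
        cumulative += w
        if r < cumulative:
            k = q
            break
    # Step 2: place the value uniformly within the selected bar.
    s = rng.random()
    return grid[k] + bar_width * (s - 0.5)

grid = [-4.875 + 0.25 * q for q in range(40)]
weights = [0.0] * 40
weights[20] = 1.0                       # degenerate histogram for the demo
rng = random.Random(0)
five_values = [draw_plausible_value(grid, weights, rng) for _ in range(5)]
print(five_values)
```

With all the weight on one bar, every draw falls within half a bar width of that bar's value, which makes the role of the second step easy to see.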


10.3.5    Effects  of  Specification  Errors  on  Plausible  Values 

Section 10.3.1.2 discussed the possibility of biases in secondary analyses of plausible values when, during the construction of those plausible values, the true conditional distribution $p(\theta \mid Y)$ is approximated by some simpler approximation $p^*(\theta \mid Y)$. Of particular interest is the case in which $Y = (Y_1, Y_2)$ and $p^*(\theta \mid Y) = p(\theta \mid Y_1)$ (that is, not all background variables are included in the conditioning process) and secondary analyses address the joint distribution of $\theta$ and $Y_2$. This section describes a simplified setting in which resulting biases can be derived explicitly, and employs the results to approximate the biases that would result in analyses of Year 15 NAEP reading plausible values.

The  conclusion  that  will  be  reached  is  that  secondary  analyses  of  NAEP 
Year  15  reading  plausible  values  that  involve  the  relationship  between 
reading  proficiency  and  non-conditioned  variables  (e.g.,  subgroup  means, 
regression analyses, and path analyses) must be interpreted with caution,
because  the  strength  of  these  relationships  will  tend  to  be  underestimated 
by  amounts  that  depend  on  the  type  of  analysis  and  the  inter-relationships 
of  the  variables  involved.    Numerical  results  for  selected  analyses  are 
presented  below  in  Section  10.3.5.4.    The  strength  of  relationships  between 
reading  proficiency  and  conditioned  variables  only,  on  the  other  hand,  will 
not  be  underestimated.    Comparisons  of  regression  coefficients  or  multiple 
correlations  with  reading  proficiency  may  thus  prove  misleading,  to  a 
degree  whose  magnitude  is  suggested  by  a  number  of  examples  from  the 
reading  database. 

The  remainder  of  this  section  provides  the  foundation  upon  which  this 
conclusion  is  based. 

10.3.5.1    Setup  and  Notation 

As  mentioned  in  Section  10.3.1.2,  closed-form  expressions  for 
secondary  biases  under  the  IRT  model  used  in  the  NAEP  reading  analysis  are 
not  readily  forthcoming.    We  therefore  derive  results  for  a  related  but 
simpler context, namely the classical true-score measurement model:

\[ X = \theta + \varepsilon, \]


Table 10.3(4)

Sampling Procedure Used to Generate Plausible Values


Step 1:  Obtain a random number r from the unit interval. Select bar k from the histogram estimated for the ith examinee such that

\[ \sum_{q=1}^{k-1} P(A_q \mid x_i, y_i) < r \le \sum_{q=1}^{k} P(A_q \mid x_i, y_i) \]

where

$P(A_q \mid x_i, y_i)$ = density of the qth bar, with value $A_q$, for q = 1, ..., 40,

$x_i$ = vector of observed item responses, and

$y_i$ = vector of responses to background and attitude items.


Step 2:  Obtain a second random number s from the unit interval. Compute the plausible value $\tilde\theta_i$ as follows:

\[ \tilde\theta_i = A_k + .2(s - .5) \]


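where

$\theta$ is the unobservable variable of interest,

$X$ is the observable variable, and

$\varepsilon$ is a random error variable.

The following distributional assumptions will be made:

\[ \varepsilon \sim N(0, \sigma^2_\varepsilon), \qquad \mathrm{Cov}(\varepsilon, \theta) = 0, \]
\[ \theta \sim N(\mu, \sigma^2_\theta); \]

it follows that $X \sim N(\mu, \sigma^2_X)$ with $\sigma^2_X = \sigma^2_\theta + \sigma^2_\varepsilon$, and $\mathrm{Cov}(\theta, X) = \sigma^2_\theta$.

The following normal linear regression model is assumed for the examinee population:

\[ \theta = \beta' Y + F, \]

where Y is a K-dimensional vector of background variables, with $Y \sim MVN(0, \Sigma)$ and $F \sim N(0, \sigma^2_{\theta|Y})$, where $\sigma^2_{\theta|Y} = \sigma^2_\theta - \beta' \Sigma \beta$. Note that

\[ E(\theta \mid Y) = E(X \mid Y) = \beta' Y, \quad \text{and} \]

\[ \mathrm{Var}(X \mid Y) = \sigma^2_\varepsilon + \sigma^2_{\theta|Y}. \]

Define the "conditional" reliability $\rho$ of X given Y as follows:

\[ \rho = \sigma^2_{\theta|Y} / [\sigma^2_{\theta|Y} + \sigma^2_\varepsilon]. \]

Note that $0 < \rho < 1$ and $\rho\, \sigma^2_{X|Y} = \sigma^2_{\theta|Y}$.

In a generalization of Kelley's (1947) formula (see also Box & Tiao, 1973, p. 74), we find that

\[ E(\theta \mid X, Y) = \rho X + (1 - \rho)\, \beta' Y \quad \text{and} \]

\[ \mathrm{Var}(\theta \mid X, Y) = (1 - \rho)\, \sigma^2_{\theta|Y}. \]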

That is, the expected value of the unobservable variable of interest, $\theta$, given the values of observable variables X and Y, is a weighted average of (i) the imperfect manifestation X, in the proportion that it is "reliable," and (ii) the expected value of the unobservable variable given background information Y, in the proportion that X is unreliable. Note that $\theta \mid X, Y$ follows a normal distribution under our simplifying assumptions.


10.3.5.2    Plausible  Values,  Complete  Conditioning 

Under the preceding assumptions, a plausible value $\tilde\theta$ is obtained in the following manner:

\[ \tilde\theta = \tilde\theta(X, Y) = E(\theta \mid X, Y) + G = \rho X + (1 - \rho)\, \beta' Y + G, \]

where G is a random number selected from $N(0, (1-\rho)\,\sigma^2_{\theta|Y})$, independently of X and Y. The following properties of $\tilde\theta$ are derived in Mislevy (in progress):

(1) $E(\tilde\theta \mid Y = y) = E(\theta \mid Y = y) = \beta' y$

(2) $E(\tilde\theta) = E(\theta) = \mu$

(3) $\mathrm{Var}(\tilde\theta \mid Y = y) = \mathrm{Var}(\theta \mid Y = y) = \sigma^2_{\theta|Y}$

(4) $\mathrm{Var}(\tilde\theta) = \mathrm{Var}(\theta) = \sigma^2_\theta$

These results indicate that the expected value of the analyses listed above, when carried out with plausible values and combinations of observable variables, is identical to the expected value when carried out with $\theta$ itself, an intrinsically unobservable variable. Even though the $\theta$ values of specific individuals remain unknown, and $\tilde\theta$ values may in fact serve poorly as estimates for individual respondents, the method of constructing the plausible values yields the correct results for population characteristics.
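
A small simulation can illustrate these properties in the simplest case of a single background variable. It is a sketch under stated assumptions, not part of the derivation: Y is a scalar with unit variance, the parameter values are arbitrary, and the helper names (`simulate_complete_conditioning`, `variance`, `slope`) are hypothetical.

```python
import random

def simulate_complete_conditioning(n=100_000, beta=0.6, var_resid=0.64,
                                   var_eps=0.36, seed=1):
    """Simulate theta, X and plausible values conditioned on all of Y."""
    rng = random.Random(seed)
    rho = var_resid / (var_resid + var_eps)      # conditional reliability
    sd_g = ((1.0 - rho) * var_resid) ** 0.5      # s.d. of the random part G
    ys, thetas, pvs = [], [], []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        theta = beta * y + rng.gauss(0.0, var_resid ** 0.5)
        x = theta + rng.gauss(0.0, var_eps ** 0.5)
        pv = rho * x + (1.0 - rho) * beta * y + rng.gauss(0.0, sd_g)
        ys.append(y)
        thetas.append(theta)
        pvs.append(pv)
    return ys, thetas, pvs

def variance(v):
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

ys, thetas, pvs = simulate_complete_conditioning()
print(variance(thetas), variance(pvs))       # both near 1.0
print(slope(ys, thetas), slope(ys, pvs))     # both near beta = 0.6
```

The plausible values reproduce the population variance and the regression on Y, even though each individual value differs from that examinee's true proficiency.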


10.3.5.3    Plausible  Values,  Incomplete  Conditioning 

Suppose that Y can be partitioned into two subvectors, $Y_1$ and $Y_2$. The same population structure holds, though it may be rewritten to reflect the partitioning as

\[ E(\theta \mid Y) = \beta_1' Y_1 + \beta_2' Y_2 \]

and

\[ \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}. \]

Define the projection operators $P_1 = \Sigma_{12} \Sigma_{22}^{-1}$ and $P_2 = \Sigma_{21} \Sigma_{11}^{-1}$. $P_1$ and $P_2$ possess the following properties:

(1) $E(Y_2 \mid Y_1 = y_1) = P_2 y_1$ and $E(Y_1 \mid Y_2 = y_2) = P_1 y_2$.

(2) The jth diagonal element of $P_1 P_2$ is the squared multiple correlation of the jth element in $Y_1$ with the elements of $Y_2$.

(3) Regression coefficients for $\theta$ and X on $Y_1$ alone or on $Y_2$ alone can be expressed as follows:

\[ E(\theta \mid Y_1 = y_1) = E(X \mid Y_1 = y_1) = \beta_1^{*\prime} y_1 = (\beta_1' + \beta_2' P_2)\, y_1 \]

and

\[ E(\theta \mid Y_2 = y_2) = E(X \mid Y_2 = y_2) = \beta_2^{*\prime} y_2 = (\beta_2' + \beta_1' P_1)\, y_2. \]

(4) If $Y_1$ and $Y_2$ are orthogonal, both $P_1$ and $P_2$ are matrices of zeros.

(5) Intuitively, $P_1 P_2 Y_1$ yields the portion of $Y_1$ predictable by $Y_2$; a similar relationship holds for the relationship of $P_2 P_1 Y_2$ to $Y_1$.

Suppose now that plausible values $\tilde\theta^*$ are constructed that take into account the relationship of $\theta$ with $Y_1$, but not with $Y_2$. That is,

\[ \tilde\theta^* = \tilde\theta^*(X, Y_1, Y_2) = E(\theta \mid X, Y_1) + G^* = \rho^* X + (1 - \rho^*)\, \beta_1^{*\prime} Y_1 + G^*, \]

where

\[ \rho^* = \frac{\sigma^2_{\theta|Y_1}}{\sigma^2_{\theta|Y_1} + \sigma^2_\varepsilon} \]

and $G^*$ is selected at random from $N(0, (1-\rho^*)\,\sigma^2_{\theta|Y_1})$, independently of X and $Y_1$.

It follows immediately from the results of the preceding subsection that

(1) $E(\tilde\theta^* \mid Y_1 = y_1) = E(\theta \mid Y_1 = y_1) = \beta_1^{*\prime} y_1$

(2) $E(\tilde\theta^*) = E(\theta) = \mu$

(3) $\mathrm{Var}(\tilde\theta^* \mid Y_1 = y_1) = \mathrm{Var}(\theta \mid Y_1 = y_1) = \sigma^2_{\theta|Y_1}$

(4) $\mathrm{Var}(\tilde\theta^*) = \mathrm{Var}(\theta) = \sigma^2_\theta$

These analyses involving properties of the distribution of $\theta$ in the population at large, and of its relationship with $Y_1$, have the same expected value when carried out with $\tilde\theta^*$ as with $\theta$.

Analyses involving $Y_2$ do not fare as well, however. Key results, again derived in Mislevy (in progress), are summarized below.

(1) Whereas $E(\theta \mid Y_1 = y_1, Y_2 = y_2) = \beta_1' y_1 + \beta_2' y_2$, we find that

\[ E(\tilde\theta^* \mid Y_1 = y_1, Y_2 = y_2) = \beta_1' y_1 + (1-\rho^*)\, \beta_2' P_2 y_1 + \rho^* \beta_2' y_2 \qquad (1a) \]

\[ = \beta_1' y_1 + (1-\rho^*)\, \beta_2' E(Y_2 \mid Y_1 = y_1) + \rho^* \beta_2' y_2 \qquad (1b) \]

\[ = \beta_1' y_1 + \beta_2' y_2 - (1-\rho^*)\, \beta_2' (y_2 - P_2 y_1). \qquad (1c) \]

A bias thus results in the construction of a plausible value for a respondent with values of $y_1$ and $y_2$ on the background variables. The contribution from $Y_1$ is correct, but the contribution from $Y_2$ is attenuated. Rather than a contribution from that person's $y_2$ value, we obtain a weighted average of the contribution from his or her particular $y_2$ to the extent that X is reliable, but from the expected value of $Y_2$ given his or her particular $y_1$ to the extent that X is unreliable. It follows from (1c) that this bias can be driven to zero in three different ways:

(i) If $\rho^* = 1$; i.e., X is a perfectly reliable measure of $\theta$;

(ii) If $\beta_2 = 0$; i.e., there is no contribution from $Y_2$ anyway;

(iii) If $P_2 y_1 = y_2$; i.e., if $E(Y_2 \mid Y_1 = y_1) = y_2$. This will be true for all $y_2$ only if $Y_2$ is perfectly predictable from $Y_1$.

Bias in the expected value of the plausible values of individual subjects is mitigated as any or all of these conditions are approached.

(2) Whereas $E(\theta \mid Y_2 = y_2) = (\beta_2' + \beta_1' P_1)\, y_2 = \beta_2^{*\prime} y_2$, we find that

\[ E(\tilde\theta^* \mid Y_2 = y_2) = \{\beta_2' [\rho^* + (1-\rho^*) P_2 P_1] + \beta_1' P_1\}\, y_2 \qquad (2a) \]

\[ = [\beta_2^{*\prime} - \beta_2' (1-\rho^*)(I - P_2 P_1)]\, y_2. \qquad (2b) \]

As in (1) above, it can be seen in (2a) that the contribution relating to the $Y_1$ space comes through correctly, but the contribution of the $Y_2$ space is again the average of the actual $y_2$ (to the extent that X is reliable) and just the portion of $y_2$ that is predictable through $Y_1$ (to the extent that X is unreliable). The bias term is reduced as either $\rho^*$ or the proportion of $Y_2$ predictable from $Y_1$ approaches 1.

(3) Whereas the regression coefficient for $y_2$ in the multiple regression of $\theta$ on $Y_1$ and $Y_2$ can be found through

\[ E[\theta - E(\theta \mid Y_1) \mid Y_2 = y_2] = \beta_2' y_2, \]

we find that

\[ E[\tilde\theta^* - E(\tilde\theta^* \mid Y_1) \mid Y_2 = y_2] = \beta_2'\, \rho^* (I - P_2 P_1)\, y_2. \]

Compared with the desired result, namely $\beta_2$, we must expect a shrunken answer when we run the regression with $\tilde\theta^*$. Shrinkage is mitigated as $\rho^*$ approaches 1 and as $P_2 P_1$ approaches zero.

(4) Whereas the regression coefficient for $y_1$ in the multiple regression of $\theta$ on $Y_1$ and $Y_2$ can be found through

\[ E[\theta - E(\theta \mid Y_2) \mid Y_1 = y_1] = \beta_1' y_1, \]

we find that

\[ E[\tilde\theta^* - E(\tilde\theta^* \mid Y_2) \mid Y_1 = y_1] = [\beta_1' + (1-\rho^*)\, \beta_2' P_2](I - P_1 P_2)\, y_1. \]

Thus bias appears in the multiple regression coefficient for $Y_1$ even though it has been conditioned upon, unless $\rho^* = 1$ and

Two aspects of these results have sobering implications for secondary analyses of plausible values. First, while higher reliability $\rho^*$ is unequivocally helpful, high shared variance between $Y_1$ and $Y_2$ is not. High shared variance mitigates shrinkage when the simple regression of $\theta$ on $Y_2$ is approximated by the regression of $\tilde\theta^*$ on $Y_2$; high shared variance exacerbates shrinkage with respect to the coefficient for $Y_2$ when the multiple regression of $\theta$ on $Y_1$ and $Y_2$ is approximated by the same analysis of $\tilde\theta^*$.

Second, a particularly popular form of secondary analysis is threatened when both conditioned and non-conditioned background variables are involved. Specifically, the size of the simple regression coefficient of proficiency on a given background variable is often compared with the corresponding partial regression coefficient when other predictors are included in the model. A decrease in the size of the coefficient of the focus variable is expected, presumably heading from $\beta^*$ toward $\beta$ as more explanatory variables are taken into account. Result (4) above indicates that if the focus variable was conditioned upon while the other explanatory variables were not, then the partial regression coefficient for the focus variable will contain a contribution properly associated with the other variables. In the extreme case of $\beta = 0$, in particular, the expected result will not be 0.
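
The attenuation of the simple regression on a non-conditioned variable can be checked numerically in the scalar case. The sketch below assumes one conditioned variable $Y_1$ and one non-conditioned variable $Y_2$, each with unit variance and correlation r, so that $P_1 = P_2 = r$; plausible values are built from X and $Y_1$ only, and the empirical slope on $Y_2$ is compared with the value given by result (2a), $\beta_2[\rho^* + (1-\rho^*) r^2] + \beta_1 r$. All parameter values and function names are hypothetical.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate_incomplete_conditioning(n=100_000, beta1=0.5, beta2=0.4, r=0.3,
                                     var_resid=0.5, var_eps=0.4, seed=2):
    """Plausible values conditioned on Y1 only; Y2 is left out."""
    rng = random.Random(seed)
    beta1_star = beta1 + beta2 * r                  # regression of theta on Y1 alone
    var_theta_given_y1 = var_resid + beta2 ** 2 * (1.0 - r ** 2)
    rho_star = var_theta_given_y1 / (var_theta_given_y1 + var_eps)
    sd_g = ((1.0 - rho_star) * var_theta_given_y1) ** 0.5
    y2s, thetas, pvs = [], [], []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = r * y1 + (1.0 - r ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        theta = beta1 * y1 + beta2 * y2 + rng.gauss(0.0, var_resid ** 0.5)
        x = theta + rng.gauss(0.0, var_eps ** 0.5)
        pv = (rho_star * x + (1.0 - rho_star) * beta1_star * y1
              + rng.gauss(0.0, sd_g))
        y2s.append(y2)
        thetas.append(theta)
        pvs.append(pv)
    # Expected slope of the plausible values on Y2 under result (2a).
    predicted = beta2 * (rho_star + (1.0 - rho_star) * r ** 2) + beta1 * r
    return y2s, thetas, pvs, predicted

y2s, thetas, pvs, predicted = simulate_incomplete_conditioning()
slope_theta = slope(y2s, thetas)     # near beta2 + beta1 * r = 0.55
slope_pv = slope(y2s, pvs)           # near the shrunken value from (2a)
print(slope_theta, slope_pv, predicted)
```

With these settings the slope based on plausible values is about 0.41 against a true value of 0.55. Raising the reliability or the correlation between $Y_1$ and $Y_2$ narrows the gap.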

10.3.5.4    Approximating Secondary Biases in the Analysis of Year 15 NAEP
Reading Plausible Values

The analyses above assume normality, linearity, knowledge of population covariance matrices, and an observable variable X that is related to $\theta$ in the same manner for all respondents. None of these assumptions is strictly satisfied in the NAEP database. The analyses may prove useful nonetheless by illustrating the order of magnitude of the secondary biases that will exist in secondary regression analyses of NAEP reading plausible values. This section describes how one may compute approximate "unshrunken" coefficients for non-conditioned variables taken one at a time, both in simple regression analyses and in multiple regression analyses with the entire set of conditioning variables. The steps are as follows:

(1) Calculate an approximate $\rho^*$, or $\hat\rho^*$. An approximation computed from number-right scores within NAEP booklets can be averaged over booklets. This value will tend to underestimate the precision of the IRT analyses, since number-right scores capture most, but not all, of the information available in response patterns. The steps to be carried out in each booklet are as follows:

(a) Calculate a reliability coefficient, say Cronbach's alpha, for percent-correct scores X (with omits and not-reached treated as incorrect). This yields

\[ \rho = \frac{\sigma^2_\theta}{\sigma^2_\theta + \sigma^2_\varepsilon} = \frac{\sigma^2_{\theta|Y_1} + \sigma^2_{\hat\theta}}{\sigma^2_{\theta|Y_1} + \sigma^2_{\hat\theta} + \sigma^2_\varepsilon}, \]

where $\sigma^2_{\hat\theta}$ is the variance of $E(\theta \mid Y_1)$, with $Y_1$ being the NAEP conditioning variables.

(b) Compute the proportion of variance in X accounted for by $Y_1$ by standard ANOVA procedures:

\[ R^2 = \frac{\sigma^2_{\hat\theta}}{\sigma^2_{\hat\theta} + \sigma^2_{\theta|Y_1} + \sigma^2_\varepsilon}, \]

since $E(X \mid Y_1) = E(\theta \mid Y_1)$.

(c) Because $\rho^* = \sigma^2_{\theta|Y_1} / (\sigma^2_{\theta|Y_1} + \sigma^2_\varepsilon)$,

\[ \rho^* = \frac{\rho - R^2}{1 - R^2}. \]

Values of $\rho^*$ as calculated from each of the NAEP booklets are shown in Tables 10.3(5), 10.3(6) and 10.3(7).

Average  per-booklet  reliability  coefficients  (rho)  are  .82,  .75,  and 
.77  for  Grade/Ages  4/9,  8/13,  and  11/17  respectively;  corresponding 
average  proportions  of  variance  of  percent-correct  scores  explained  by 
conditioning  variables  are  .26,  .25,  and  .28;  and  reliabilities  after 
partialling  out  conditioning  variables  (rho-star)  are  .75,  .67,  and 
.68.    Recall  that  it  is  these  latter  values  that  set  the  tone  for  the 
magnitude  of  shrinkage  effects  to  be  expected  in  secondary  analyses  of 
non-conditioned  variables.    A  fairly  strong  relationship  will  be  noted 




Table 10.3(5)

Reliability Coefficients by Booklet
Grade 4/Age 9

BOOKLET    # ITEMS    RHO     R-2     RHO-STAR

   1          7       0.79    0.25    0.72
   2         20       0.87    0.29    0.82
   5         10       0.75    0.19    0.69
   6         10       0.76    0.18    0.71
   7         35       0.92    0.37    0.87
   8         24       0.83    0.21    0.79
   9         30       0.90    0.37    0.84
  11         12       0.71    0.23    0.63
  12         10       0.77    0.21    0.71
  13         23       0.81    0.27    0.75
  14         21       0.87    0.24    0.83
  15         12       0.65    0.18    0.57
  16         18       0.84    0.19    0.80
  17         22       0.87    0.28    0.83
  18         13       0.82    0.30    0.74
  19         18       0.82    0.28    0.76
  20         12       0.79    0.21    0.74
  21         12       0.80    0.31    0.72
  22         32       0.90    0.34    0.85
  23         17       0.86    0.28    0.80
  24         12       0.75    0.21    0.68
  25         28       0.87    0.27    0.83
  26         11       0.76    0.25    0.68
  27         20       0.81    0.31    0.73
  28         20       0.78    0.22    0.71
  29         22       0.83    0.31    0.75
  30         20       0.87    0.34    0.81
  31         21       0.82    0.26    0.76
  32         19       0.84    0.40    0.74
  33         23       0.83    0.33    0.74
  34         11       0.66    0.25    0.54
  35         23       0.87    0.25    0.82
  36         11       0.55    0.27    0.38
  37         11       0.70    0.21    0.63
  38         22       0.80    0.27    0.73
  39         13       0.82    0.31    0.74
  40          7       0.78    0.23    0.71
  41         31       0.88    0.33    0.83
  42          9       0.72    0.18    0.65
  43          9       0.76    0.21    0.70
  45         24       0.88    0.30    0.84
  46         12       0.84    0.22    0.79
  47         24       0.88    0.29    0.83
  48         12       0.80    0.26    0.73
  49         36       0.87    0.27    0.82
  50         25       0.85    0.24    0.80
  51         21       0.87    0.36    0.80
  53         35       0.92    0.31    0.89
  54         11       0.74    0.23    0.66
  55         24       0.88    0.25    0.84
  56         23       0.89    0.30    0.84
  57         22       0.87    0.32    0.81
  58         18       0.84    0.24    0.79
  59         20       0.87    0.25    0.82
  60         21       0.85    0.27    0.79
  61         10       0.75    0.19    0.70
  63         11       0.80    0.23    0.75

 MEAN        18.46    0.82    0.26    0.75


Table 10.3(6)

Reliability Coefficients by Booklet
Grade 8/Age 13

BOOKLET    # ITEMS    RHO     R-2     RHO-STAR

   1          6       0.56    0.20    0.45
   2         15       0.79    0.37    0.66
   5         12       0.68    0.17    0.62
   6         12       0.66    0.16    0.60
   7         31       0.89    0.20    0.86
   8         22       0.85    0.27    0.80
   9         28       0.86    0.26    0.82
  11         11       0.67    0.20    0.58
  12         12       0.71    0.15    0.65
  13         19       0.79    0.22    0.73
  14         22       0.84    0.35    0.76
  15         11       0.62    0.27    0.48
  16         23       0.79    0.28    0.70
  17         23       0.83    0.27    0.77
  18          9       0.69    0.24    0.60
  19         14       0.76    0.27    0.67
  20         12       0.81    0.30    0.73
  21         12       0.78    0.26    0.70
  22         37       0.86    0.33    0.80
  23         18       0.75    0.19    0.70
  24         11       0.75    0.20    0.69
  25         27       0.85    0.28    0.79
  26         17       0.70    0.19    0.62
  27         18       0.80    0.23    0.74
  28         20       0.76    0.32    0.65
  29         27       0.77    0.30    0.68
  30         27       0.82    0.34    0.72
  31         22       0.75    0.25    0.66
  32         17       0.76    0.22    0.70
  33         21       0.83    0.29    0.76
  34         10       0.50    0.25    0.34
  35         28       0.82    0.35    0.72
  36         10       0.51    0.26    0.34
  37          8       0.67    0.25    0.55
  38         18       0.72    0.27    0.62
  39          9       0.59    0.20    0.49
  40          6       0.50    0.15    0.41
  41         33       0.86    0.21    0.82
  42         10       0.64    0.26    0.51
  43         10       0.67    0.29    0.53
  45         17       0.79    0.27    0.71
  46         12       0.79    0.29    0.71
  47         26       0.81    0.33    0.72
  48         11       0.77    0.20    0.71
  49         30       0.80    0.25    0.74
  50         20       0.81    0.27    0.74
  51         28       0.80    0.31    0.70
  53         33       0.87    0.33    0.81
  54          8       0.69    0.18    0.63
  55         23       0.85    0.27    0.79
  56         29       0.82    0.29    0.74
  57         19       0.75    0.31    0.64
  58         22       0.81    0.25    0.74
  59         18       0.85    0.27    0.79
  60         11       0.74    0.15    0.69
  63          6       0.73    0.17    0.68

 MEAN        18.05    0.75    0.25    0.67


Table  10.3(7) 


Reliability  Coefficients  by  Booklet 
Grade 11/Age 17


BOOKLET    # ITEMS    RHO     R-2     RHO-STAR

1 

1 

z; 
D 

U  .  Di 

n  on 

f\  CI 

1  A 
ID 

U .  oU 

U.  ZD 

f\  TO 
U.  /j 

1  9 

n  7*\ 

U.  jl 

U.  d4 

D 

1  9 

n  7*\ 

n  on 

U.  69 

7 

0  7 

c\  on 
u .  yu 

0.  o4 

p 
o 

1  Q 
io 

U  •  oZ 

C\  OA 

A  TO 

U.  73 

0 

11 
Ji 

U  •  oO 

n  0^ 

yj.77 

1 1 

1  1 

1 1 
1 1 

n  71 
U  t  /  i 

n  0  "7 
U.  Z / 

U .  dU 

1  9 

1  9 

n  79 

n  oc 
U.  J!) 

U.  3/ 

1 1 

1  Q 

f\  DO 

A    O  O 

A  TO 

0,  7Z 

1  A 

1  7 

U  •  oj 

A    O  / 

A  TT 

1  s 

1  1 
1 1 

U .  Do 

C\    0  1 

A  CO 

Uoo 

1  A 
ID 

1  7 

U .  /U 

f\    0 1 

U.  Jl 

A    C  T 

0.57 

1  7 

1  Q 

rt  Ol 

U  •  o  J 

n    0  T 

U,  J/ 

A    1  / 

0.  74 

1  ft 
iO 

1  n 

U .  /o 

n   0  ^ 

A  TA 

0.  70 

1  Q 

1  A 

n  7Q 
U .  /o 

n  0  7 

A  TA 

0.  70 

90 

1  9 

U  •  oU 

n  01 

A    T 1 

0.  /I 

91 

1  9 

U.  /o 

n  00 

A  A7 

99 

U  •  OD 

(\  OQ 

U«  Zo 

A    0 1 

0  •  ol 

1  Q 

io 

n  71 

n  0  *\ 

A    A  A 

0 .  d4 

9  A 

7 

n  79 

n  0  0 
U.  15 

A  AO 
0  .  Dj 

9S 

1  Q 

io 

n  on 
u .  oU 

(\    0  Q 

u.  zy 

A    T 1 

0  •  /I 

9fi 

1  1 
i  i 

U  .  DO 

n  0  ^ 

A  CT 
0.0/ 

97 

1  7 
i  J 

n  77 

n  OA 
U .  Z4 

A  "lA 

0 .  /O 

9ft 

io 

n  77 

n  0  Q 

u.  zy 

A  AO 
0  .  DO 

2Q 

9A 

u  •  O  i 

n  AH 

A  AO 
0  .  DO 

1  A 
i  D 

n  7n 

n  OQ 
U .  ^.o 

n  CO 

u .  jy 

J  1 

9S 

U  •  O  J 

n  OQ 
u .  jy 

n  70 
U.  /Z 

19 

1  7 
i  / 

n  7A 

U .  /D 

n  OA 

U.  ZD 

n  A7 

U  .  D  / 

J  J 

9n 

u  •  o3 

n  on 
U.  jU 

A  "IQ 

u .  /y 

'^A 

1 

n  71 

u  •  /  i 

n  OA 

U .  ZD 

n  AH 
U.  DU 

BOOKLET    # ITEMS    RHO     R-2     RHO-STAR

  35         22       0.82    0.29    0.75
  36         13       0.74    0.29    0.64
  37          8       0.67    0.29    0.53
  38         21       0.78    0.36    0.65
  39         10       0.70    0.25    0.61
  40          6       0.62    0.13    0.56
  41         28       0.85    0.30    0.79
  42          5       0.41    0.15    0.30
  43          5       0.54    0.19    0.43
  45         18       0.81    0.19    0.76
  46         12       0.82    0.30    0.74
  47         21       0.78    0.33    [illegible]
  48          7       0.72    0.19    [illegible]
  49         34       0.89    0.32    0.84
  50         17       0.80    0.24    0.74
  51         18       0.80    0.35    0.69
  53         34       0.90    [illegible]    [illegible]
  54          8       0.78    0.27    0.70
  55         23       0.84    0.30    0.77
  56         23       0.80    0.39    0.68
  57         15       0.79    0.26    0.72
  58         17       0.80    0.20    0.75
  59         18       0.85    0.26    0.80
  60          7       0.74    0.23    0.67
  63          6       0.79    0.10    0.77

 MEAN        16.13    0.77    0.28    0.68


between  the  numbers  of  items  in  booklets  and  their  reliability 
coefficients. 

(2) For a given background variable $Y_2$ (or a single contrast involving $Y_2$ if it is a categorical variable with several categories), compute the squared multiple correlation $R^2$ between $Y_2$ and $Y_1$.

(3) Run the multiple regression analysis for plausible values that includes $Y_1$ and $Y_2$ as predictors. The expected coefficient for $Y_2$ is $B = \beta_2\, \rho^* (1 - R^2)$. From this, one can calculate an "unshrunken" estimate of the partial regression coefficient:

\[ \hat\beta_2 = B / [\rho^* (1 - R^2)]. \]

(4) Run the simple regression of $\tilde\theta^*$ on $Y_2$. The expected coefficient is $B^* = \beta_2^* - \beta_2 (1-\rho^*)(1-R^2)$. From this, and the result of Step 3, one can calculate an "unshrunken" estimate of the simple regression coefficient for $Y_2$:

\[ \hat\beta_2^* = B^* + \hat\beta_2 (1-\rho^*)(1-R^2). \]
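
The arithmetic in Steps (1) through (4) can be sketched as three small functions. This is an illustration under the reading of the formulas given here, namely $\rho^* = (\rho - R^2)/(1 - R^2)$ for each booklet, $\hat\beta_2 = B/[\rho^*(1-R^2)]$, and $\hat\beta_2^* = B^* + \hat\beta_2(1-\rho^*)(1-R^2)$. The function and argument names are hypothetical, and the inputs in the example are values as read from the Grade 4/Age 9 tables.

```python
def rho_star(rho, r2_x):
    """Reliability after partialling out the conditioning variables.

    rho:  reliability of percent-correct scores in a booklet
    r2_x: proportion of score variance explained by conditioning variables
    """
    return (rho - r2_x) / (1.0 - r2_x)

def unshrunken_partial(b, rho_star_value, r2_y):
    """Step 3: partial coefficient B divided by its shrinkage factor.

    r2_y: squared multiple correlation of the focal variable with the
          conditioning variables
    """
    return b / (rho_star_value * (1.0 - r2_y))

def unshrunken_simple(b_star, beta2_hat, rho_star_value, r2_y):
    """Step 4: simple coefficient B* plus the estimated shrinkage term."""
    return b_star + beta2_hat * (1.0 - rho_star_value) * (1.0 - r2_y)

# Booklet 1, Grade 4/Age 9: RHO = 0.79, R-2 = 0.25.
booklet_1 = rho_star(0.79, 0.25)

# "Hours TV", Grade 4/Age 9: R-square = .04, B = -.79, B-star = -1.63,
# using the average rho-star of .75 for this grade/age.
beta2_hat = unshrunken_partial(-0.79, 0.75, 0.04)
beta2_star_hat = unshrunken_simple(-1.63, beta2_hat, 0.75, 0.04)
pct_shrink_multiple = 100.0 * (1.0 - 0.75 * (1.0 - 0.04))
print(booklet_1, beta2_hat, beta2_star_hat, pct_shrink_multiple)
```

Under this reading the results agree with the tabled entries for that row to rounding (a Beta of about -1.10 and a Beta-star of about -1.89), with a multiple-regression shrinkage of about 28 percent.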

Tables 10.3(8) through 10.3(10) carry out these steps for a number of non-conditioned NAEP variables in the three grade/age samples. The column labeled "R-square" indicates the proportion of variance in the focal non-conditioned variable that is accounted for by the conditioning variables; that is, $R^2$. The columns headed "Multiple regression" concern the estimation of a regression coefficient for the focal variable in a multiple regression equation containing it and all conditioning variables as predictors of proficiency. The columns headed "Simple regression" concern the simple regression of proficiency upon the focal variable alone.

Results for the simple regressions indicate shrinkages from about 5 to 15 percent, with the average about 10 percent. As expected, shrinkages of this type are smallest when the focal variable exhibits comparatively higher shared variation with the conditioning variables. The percentage of pupils in a respondent's school that participate in federal lunch programs is an example. This variable is related to ethnicity, parents' education, and type of community, resulting in an R-square of about .2. Simple regression shrinkage is thus minimal for this variable, at about 5 percent.

Results  for  the  multiple  regressions  indicate  shrinkages  between  25  and 
45 percent, with the average about 35 percent. Now, variables with high
shared  variation  exhibit  greater  shrinkages.    The  shrinkage  for  "percent  in 
lunch  program"  is  38  percent  for  the  youngest  subsample  (where  reliability 
is  highest)  and  about  45  percent  for  the  two  older  subsamples. 

To  repeat  the  introduction  to  this  section,  we  arrive  at  two 
conclusions.    First,  secondary  analyses  involving  relationships  between 
NAEP  Year  15  reading  plausible  values  and  conditioning  variables  provide, 




Table 10.3(8)

Approximate Shrinkage of Regression Coefficients of
Non-conditioned Background Variables: Grade 4/Age 9

                                   Multiple Regression            Simple Regression
Effect               R-square*    B        Beta    % Shrink    B-star   Beta-star  % Shrink

Hours TV               .04        -.79    -1.10     27.78      -1.63     -1.89      13.93
Papers read            .03        4.92     6.75     27.03      10.18     11.81      13.81
% pupils in
  lunch pgm            .18        -.07     -.11     38.03       -.23      -.25       8.96
Minority school        .11       -1.20    -1.80     33.28      -9.51     -9.90       4.01
School problems        .07        1.93     2.78     30.36       5.28      5.92      10.82
Minority
  reading pgm          .05      -11.92   -16.73     28.77     -34.03    -37.98      10.40
Classes taken          .03         .54      .74     27.30       2.14      2.32       7.67

* Proportion of variation of focal effect accounted for by conditioning variables


Table 10.3(9)

Approximate Shrinkage of Regression Coefficients of
Non-conditioned Background Variables: Grade 8/Age 13

                                   Multiple Regression            Simple Regression
Effect               R-square*    B        Beta    % Shrink    B-star   Beta-star  % Shrink

Hours TV               .03         .11      .17     34.89        .33       .39      14.22
Papers read            .03        6.26     9.65     35.07      12.33     15.41      19.96
% pupils in
  lunch pgm            .20        -.02     -.04     46.25       -.19      -.20       5.94
Minority school        .15        1.63     2.86     43.04      -3.88     -3.08     -25.92
School problems        .07        1.40     2.25     37.81       3.26      3.94      17.46
Minority
  reading pgm          .07       -3.72    -5.97     37.69     -27.14    -28.97       6.30
Classes taken          .08        1.75     2.84     38.42       3.69      4.55      18.87

* Proportion of variation of focal effect accounted for by conditioning variables


Table 10.3(10)

Approximate Shrinkage of Regression Coefficients of
Non-conditioned Background Variables: Grade 11/Age 17

                                   Multiple Regression            Simple Regression
Effect               R-square*    B        Beta    % Shrink    B-star   Beta-star  % Shrink

Hours TV               .06       -1.06    -1.65     36.03      -3.20     -3.70      13.42
Papers read            .02        9.80    14.73     33.49      15.03     19.63      23.43
% pupils in
  lunch pgm            .20        -.03     -.06     45.28       -.30      -.31       4.59
Minority school        .14        1.55     2.66     41.69      -7.99     -7.26     -10.03
School problems        .12        1.17     1.94     39.81       5.26      5.80       9.44
Minority
  reading pgm          .01       -6.45    -9.60     32.81     -33.70    -36.73       8.25
Academic courses       .18        1.10     1.98     44.19       2.80      3.32      15.62

* Proportion of variation of focal effect accounted for by conditioning variables


by construction, consistent estimates of modeled effects.  Second, by
way of contrast, analyses that involve the relationship of reading
proficiency and non-conditioned variables must be interpreted with caution,
because the strength of these relationships will tend to be underestimated.
The underestimation is least serious when only the margin of a single
unconditioned variable is addressed (about 10 percent on the average), but
more serious when higher order features of the joint relationship of
proficiency, conditioning variables, and unconditioned variables are
addressed, as in multiple regression (35 percent on the average).

10.3.6    BIB/Pace  Equating 

In addition to the reading responses solicited under BIB administration
conditions during the Year 15 assessment, responses were solicited from
national probability samples of age-eligible pupils under the pace
conditions that characterized past NAEP assessments.  These pace data play
a pivotal role in NAEP.  Because they were obtained under the same
administration conditions as data from past assessments, they make possible
the continuation of unbroken trend lines from the past.  Because they were
obtained from samples of pupils that were randomly equivalent to those of
the age-based BIB samples, they make possible the linkage of BIB and pace
results.  This section describes the steps that were taken to link the Year
15 pace results into the Year 15 BIB scale discussed above.  Chapter 10.4
describes the analysis of trends, from past assessments through the
Year 15 pace assessment.


10.3.6.1    Percent-Correct  Results 

Before discussing the IRT procedures employed to equate the spiralled
and paced data, it is useful to examine descriptive characteristics of the
data of the two types in terms of percentages of correct responses.  In
this more familiar metric, the reader may more easily judge for himself or
herself the magnitude of the effect of administration, its consistency over
items, and the degree of differential effects among gender and ethnicity
groups.

Figures 10.3-3 through 10.3-8 plot weighted percents-correct from
spiralled administration against those of paced administration for the same
item among all Age 9 pupils and among Age 9 males, females, whites, blacks,
and Hispanics.  Only those items included in IRT scaling are shown; plots
incorporating non-IRT items are also available from ETS upon request.
Figures 10.3-9 through 10.3-14 present similar results for Age 13, and
Figures 10.3-15 through 10.3-20 present results for Age 17.

Inspection of these plots reveals only modest administration effects,
mainly linear and similar for most items.  Due to the greater precision of
their percent-correct values, plots involving the larger groups (male,
female, and white) exhibit less scatter than plots for smaller groups
(black and Hispanic).  More detailed comparisons can be obtained from Table
10.3(11), which presents correlations among percents-correct under the two
administrations, and regression lines predicting paced percents-correct
from spiralled percents-correct.


Figures 10.3-3 through 10.3-20
BIB-PACE % CORRECT, IRT ITEMS

[Scatterplots not reproduced.  Each figure plots percent-correct under pace
administration (vertical axis, PACE) against percent-correct under BIB
administration (horizontal axis, BIB).]

Figure 10.3-3     AGE 09 - TOTAL
Figure 10.3-4     AGE 09 - MALE
Figure 10.3-5     AGE 09 - FEMALE
Figure 10.3-6     AGE 09 - WHITE
Figure 10.3-7     AGE 09 - BLACK
Figure 10.3-8     AGE 09 - HISPANIC
Figure 10.3-9     AGE 13 - TOTAL
Figure 10.3-10    AGE 13 - MALE
Figure 10.3-11    AGE 13 - FEMALE
Figure 10.3-12    AGE 13 - WHITE
Figure 10.3-13    AGE 13 - BLACK
Figure 10.3-14    AGE 13 - HISPANIC
Figure 10.3-15    AGE 17 - TOTAL
Figure 10.3-16    AGE 17 - MALE
Figure 10.3-17    AGE 17 - FEMALE
Figure 10.3-18    AGE 17 - WHITE
Figure 10.3-19    AGE 17 - BLACK
Figure 10.3-20    AGE 17 - HISPANIC


Table 10.3(11)


Correlations and Regression Coefficients for Spiral
vs. Pace Percents-Correct of IRT Reading Items


                                        Regression Coefficients*

Age    Group       Correlation    Intercept     (se)           Slope        (se)

9      Total          .91         [illegible]   [illegible]    1.02         (.06)
       Male           .91         [illegible]   [illegible]    1.01         (.06)
       Female         .92         [illegible]   [illegible]    1.02         (.06)
       White          .91         2.35          (3.63)         1.01         (.06)
       Black          .89         1.60          (3.10)         1.01         (.06)
       Hispanic       .92          .22          (2.65)         [illegible]  [illegible]

13     Total          .99          .67          (1.19)         1.01         (.02)
       Male           .99          .68          (1.31)         1.02         (.02)
       Female         .99         1.07          (1.31)         1.01         (.02)
       White          .99         1.23          (1.24)         1.02         (.02)
       Black          .97         1.74          (1.75)         1.01         (.03)
       Hispanic       .97          .91          (1.91)         1.01         (.03)

17     Total          .99         3.79          (1.21)         1.00         (.02)
       Male           .99         2.46          (1.28)         1.02         (.02)
       Female         .99         5.54          (1.67)          .97         (.02)
       White          .99         5.80          (1.05)          .97         (.01)
       Black          .97         1.46          (2.22)         1.05         (.04)
       Hispanic       .97          .26          (2.53)         1.03         (.04)

*Regression of paced percent-correct on spiralled percent-correct,
percentage-point units.


The main results are as follows.

(1)  While the effect of mode of administration varies somewhat
from item to item, the average effect was for items to be
slightly easier under paced conditions than BIB.  As indicated
by the intercept coefficient, the size of the effect was on
the order of 1 percentage point in Ages 9 and 13, and was not
statistically significant; it was about 4 percentage points
for Age 17, and was statistically significant.

(2)  Item-by-administration interactions were negligible at Ages 13
and 17 (as evidenced by correlations of .99), but were
somewhat more apparent at Age 9 (a correlation of .91).
Patterns of interactions, as seen in the plots, are similar
across gender and ethnicity groups.

(3)  No significant gender-by-administration or ethnicity-by-
administration interactions were observed in Ages 9 or 13.  At
Age 17, however, the effect of pacing favored females over
males (5.5 percentage points to 2.5), and whites over blacks
and Hispanics (5.8 percentage points to 1.5 and .3).

This last finding suggests that the change from paced tape
administration had little effect, if any, on the relative differences in
performance among subgroups.  In particular, the new procedures do not
appear to have had a detrimental effect on the performance of black and
Hispanic students.
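
The regression of paced percent-correct on spiralled percent-correct that
underlies Table 10.3(11) can be illustrated with a minimal sketch.  It
assumes an unweighted ordinary least squares fit over items; whether the
original fit weighted items is not stated.  The function name and the item
values are hypothetical and are not NAEP data.

```python
import math

def simple_regression(x, y):
    """Unweighted OLS of y on x.

    Returns (intercept, slope, se_intercept, se_slope, correlation).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    resid_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = resid_ss / (n - 2)
    se_slope = math.sqrt(s2 / sxx)
    se_intercept = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    correlation = sxy / math.sqrt(sxx * syy)
    return intercept, slope, se_intercept, se_slope, correlation

# Hypothetical item-level percents-correct (illustration only).
bib = [35.0, 48.0, 52.0, 61.0, 70.0, 77.0, 84.0, 90.0]
pace = [37.5, 49.0, 55.0, 62.5, 72.0, 79.5, 85.0, 92.5]

intercept, slope, se_i, se_s, r = simple_regression(bib, pace)
print(f"intercept={intercept:.2f} ({se_i:.2f})  slope={slope:.2f} ({se_s:.2f})  r={r:.3f}")
```

In this reading, an intercept above zero with a slope near one corresponds
to items being uniformly somewhat easier under paced conditions.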


10.3.6.2    IRT  Equating  Procedures 

The three-parameter logistic IRT model was employed for the Year 15 BIB
data and for pace data (Year 15 pace, plus all past reading assessments).
For reasons to be made clear below, the pace scaling was carried out
separately within each age.  There is no guarantee that the reading
proficiency variables defined in these separate analyses are in fact
measuring exactly the same skill.  In fact, there is reason to suspect they
are not, since under BIB conditions the pupils had to read directions and
control the amount of time they spent on each item for themselves.

The practical question is whether a reasonably straightforward
relationship can be maintained between the variables, such that the same
family of IRT models and the same reporting scale can be used for BIB data
and pace data from all ages and years.  If the same IRT model is to be
used, the form of the three-parameter logistic model restricts the
relationship among the variables measured under BIB and under pace at the
three ages to be linear.  That is, if the probability that person i from
age group k will answer item j correctly under BIB conditions is given by

    \[ P(X_{ij} = 1 \mid a_j, b_j, c_j, \theta_i)
         = c_j + (1 - c_j)\,\Psi[1.7\,a_j(\theta_i - b_j)] , \]

then the probability of a correct response under pace conditions must be of
the form

    \[ P(X_{ij} = 1 \mid a_j, b_j, c_j, \theta_i)
         = c_j + (1 - c_j)\,\Psi\{1.7\,(a_j / M_k)[\theta_i - (M_k b_j + X_k)]\} , \]

where M_k and X_k are constants that apply to all items in the domain.

Allowing  different  constants  M  and  X  for  different  age  groups  allows 
for  the  possibility  that  BIB/pace  differences  may  interact  with  age.  To 
the  degree  that  such  interactions  exist,  pace  results  from  different  ages 
translated  to  the  BIB  scale  may  not  be  strictly  comparable.  Comparisons 
within  ages  are  comparable  over  all  time  points,  however,  as  are 
comparisons  within  age  from  BIB  to  pace. 
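
The reason a linear relationship preserves the form of the model can be
checked numerically.  The following is a minimal sketch, assuming the
logistic function with the 1.7 scaling constant; the item parameters and
the constants M and X below are hypothetical values chosen for
illustration, and the function names are not from the original analysis.

```python
import math

def p_3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

def rescale_item(a, b, c, M, X):
    """Item parameters re-expressed on the scale theta' = M * theta + X."""
    return a / M, M * b + X, c

# Hypothetical item and hypothetical constants for one age group.
a, b, c = 1.2, 0.3, 0.2
M, X = 0.9, 0.15
theta = -0.4

a_p, b_p, c_p = rescale_item(a, b, c, M, X)
p_original = p_3pl(theta, a, b, c)
p_rescaled = p_3pl(M * theta + X, a_p, b_p, c_p)

# The two probabilities agree because
# (a / M) * ((M * theta + X) - (M * b + X)) == a * (theta - b).
print(p_original, p_rescaled)
```

A nonlinear relation between the two proficiency variables would not leave
the response function in three-parameter logistic form, which is why the
linearity assumption is checked in the steps that follow.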

Assuming a linear relationship on the θ scale between BIB and pace data,
the relationship can be estimated by use of the randomly equivalent BIB and
pace samples at each age level.  One way that this can be accomplished is to
fit the three-parameter model to the BIB and pace data separately (as was
in fact done), and translate the pace scale (from its arbitrary origin and
unit-size) in a manner that matches the first two moments of the age
distribution to that obtained from the BIB sample.
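
Matching the first two moments amounts to choosing a slope and an
intercept for the pace scale.  The sketch below assumes each latent
distribution is summarized as a histogram (points and weights); the
histogram values and function names are hypothetical and serve only to
show the arithmetic.

```python
import math

def weighted_mean_sd(points, weights):
    """Mean and standard deviation of a discrete (histogram) distribution."""
    total = sum(weights)
    mean = sum(p * w for p, w in zip(points, weights)) / total
    var = sum(w * (p - mean) ** 2 for p, w in zip(points, weights)) / total
    return mean, math.sqrt(var)

def moment_matching_constants(pace_points, pace_weights, bib_points, bib_weights):
    """Return (slope, intercept) so that slope * theta_pace + intercept
    has the same mean and standard deviation as the BIB distribution."""
    mean_p, sd_p = weighted_mean_sd(pace_points, pace_weights)
    mean_b, sd_b = weighted_mean_sd(bib_points, bib_weights)
    slope = sd_b / sd_p
    intercept = mean_b - slope * mean_p
    return slope, intercept

# Hypothetical histograms (illustration only, not NAEP estimates).
pace_points = [-2.0, -1.0, 0.0, 1.0, 2.0]
pace_weights = [0.10, 0.20, 0.40, 0.20, 0.10]
bib_points = [-1.5, -0.5, 0.5, 1.5, 2.5]
bib_weights = [0.10, 0.25, 0.30, 0.25, 0.10]

slope, intercept = moment_matching_constants(
    pace_points, pace_weights, bib_points, bib_weights)
matched_points = [slope * t + intercept for t in pace_points]
print(weighted_mean_sd(matched_points, pace_weights))
print(weighted_mean_sd(bib_points, bib_weights))
```

Only the mean and standard deviation are forced to agree; skewness,
kurtosis, and subgroup means remain free, so they can serve as checks.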

To this end, the model was fit to the Year 15 BIB data in the manner
described in the preceding sections, and to all pace data (Year 15 pace
plus all past assessments) for each age separately in the manner described
in the following section.  The distributions of the Year 15 BIB samples in
each age (on the Year 15 reading proficiency θ scale), and of each Year 15
pace age sample (on a provisional scale with an arbitrary origin and
unit-size) were estimated by means of the computer program RESOLVE
(Mislevy, 1985c), which provided a non-parametric approximation of the
latent θ distribution in terms of a histogram.  It should be noted that
this procedure estimates the latent distribution directly rather than using
point estimates for individuals, thereby avoiding difficulties associated
with infinite estimates for aberrant response patterns and with differing
precision of tests with different lengths (see Mislevy, 1984b, for
details).

It  is  important  to  note  that  the  assumption  of  a  linear  relationship 
can  be  checked  in  three  ways.    Matching  the  first  two  moments  of  the 
distributions  ensures  that  these  two  characteristics  of  the  BIB  and  pace 
data  will  agree;  other  characteristics  have  not  been  constrained  to  match, 
but  should  match  fairly  well  if  the  assumption  is  reasonable.  After 
matching  moments  in  the  manner  described  above,  three  checks  of  the 
veracity  of  the  linearity  assumption  were  performed: 


(1)  Match of distributional shape.  Distributions with identical
means and variances can differ considerably with respect to
features such as skewness, kurtosis, and multiplicity of
modes.  Such findings would indicate that (at best) a
curvilinear rather than linear relationship would be required
to match BIB and pace θ's, thus precluding the possibility of
using a joint three-parameter logistic model for responses
gathered under both conditions.  The plots obtained from the
three age-sample comparisons are shown as Figures 10.3-21
through 10.3-23.  They indicate a reasonably good match of
higher-order features.

(2)  Match  of  subgroup  means.    While  the  means  and  standard 
deviations  from  an  age  sample  as  a  whole  were  constrained  to 
match,  means  of  population  subgroups  were  not.    Matches  at  the 
level  of  subgroup  means  are  crucial,  however,  if  comparability 
is  to  be  claimed;  even  if  the  distributions  for  the  population 
as  a  whole  were  quite  similar,  the  finding  that  differences 
among  major  subgroups  reversed  orders  or  shifted  dramatically 
in  magnitude  would  also  preclude  the  pooling  of  BIB  and  pace 
results.    Tables  10.3(12)  through  10.3(14)  present  population 
and  subgroup  means  from  the  three  ages,  as  computed  from  BIB 
and  pace  data  after  the  first  two  population  moments  have  been 
matched.  Consistent agreement as to order and relative
magnitudes of differences among major subgroups is uniformly
evident.

(3)  Match  of  item  parameters.    The  parameter  estimates  of  items 
taken  by  any  pair  of  ages  under  pace  conditions  and  estimated 
separately  within  age-group  data,  should  be  in  essential 
agreement  after  the  scales  of  the  separate  pace  age-group 
analyses  are  translated  to  the  Year  15  BIB  scale.  Pairwise 
plots  for  the  estimates  of  b  parameters  for  items  administered 
to  Ages  9  and  13,  and  13  and  17,  appear  as  Figures  10.3-24  and 
10.3-25.    Excellent  agreement  is  shown  for  the  13  versus  17 
plot;  good  agreement  is  shown  for  the  9  versus  13  plot,  except 
that  items  that  were  difficult  for  both  ages  (high  b  values) 
tended  to  be  relatively  even  more  difficult  for  9-year-olds 
under  pace  conditions. 
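
The second check, agreement of subgroup means, can be sketched as a
comparison of rank order and of BIB-minus-pace gaps.  The region means
below are the Age 9 values printed in Table 10.3(12); the two helper
functions are hypothetical and are offered only as one simple way to
summarize agreement, not as the procedure actually used.

```python
def same_order(bib_means, pace_means):
    """True if subgroups rank identically under BIB and pace."""
    names = list(bib_means)
    return sorted(names, key=bib_means.get) == sorted(names, key=pace_means.get)

def largest_gap(bib_means, pace_means):
    """Subgroup with the largest absolute BIB-pace difference, and that gap."""
    name = max(bib_means, key=lambda k: abs(bib_means[k] - pace_means[k]))
    return name, abs(bib_means[name] - pace_means[name])

# Age 9 region means as printed in Table 10.3(12).
region_bib = {"NE": 217.3, "SE": 207.9, "Central": 217.4, "West": 210.9}
region_pace = {"NE": 217.6, "SE": 205.1, "Central": 217.8, "West": 213.6}

print(same_order(region_bib, region_pace))
name, gap = largest_gap(region_bib, region_pace)
print(name, round(gap, 1))
```

A judgment about whether a gap is large would also need the standard
errors of the subgroup means, which this sketch does not use.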

Taken together, these results may be considered as justification for
combining BIB and pace estimates of the distribution of a generalized
reading proficiency variable.  Defined operationally at each age level,
this variable implies performance on the NAEP items under BIB
administration conditions through the three-parameter logistic IRT model
and the BIB item parameters, and under pace conditions through the same
model after the linear transformation that matches the first two moments of
the appropriate age sample.

After the translation has been accomplished, any characteristic of the
Year 15 age populations can be estimated by combining the nearly
independent estimates calculated from the BIB and pace data.  Consider, for
example, two estimates X_B and X_P of the same subgroup mean calculated from
BIB and pace data, with corresponding jackknife variance estimates V_B and
V_P.  The combined estimate of the subgroup mean is

    \[ X_{\cdot} = W_B X_B + W_P X_P , \]

where

    \[ W_B = \frac{V_P}{V_B + V_P} \quad\text{and}\quad
       W_P = \frac{V_B}{V_B + V_P} ; \]

its estimated variance is

    \[ V_{\cdot} = W_B^2 V_B + W_P^2 V_P . \]
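
The combination of the two estimates can be written out as a short
sketch.  It takes the weights to be inverse-variance weights, which is an
assumption of this example, and treats the two estimates as independent.
The means below are the Age 9 totals from Table 10.3(12); the variances
are hypothetical.

```python
def combine_estimates(x_b, v_b, x_p, v_p):
    """Combine independent BIB and pace estimates of the same quantity.

    Weights are inverse-variance weights (an assumption of this sketch);
    they sum to one, so the combined estimate is a weighted average.
    """
    w_b = v_p / (v_b + v_p)
    w_p = v_b / (v_b + v_p)
    combined = w_b * x_b + w_p * x_p
    variance = w_b ** 2 * v_b + w_p ** 2 * v_p
    return combined, variance

# Means from Table 10.3(12), Age 9 total; variances are hypothetical.
x_b, v_b = 213.3, 1.0
x_p, v_p = 213.5, 4.0

combined, variance = combine_estimates(x_b, v_b, x_p, v_p)
print(round(combined, 2), round(variance, 2))
```

With these weights the combined variance is smaller than either input
variance, and the more precise estimate receives the larger weight.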


Figure 10.3-21

NAEP BIB/PACE Population Equating
THETA Distributions - Age 9

[Plot not reproduced: estimated THETA distributions under BIB
administration and under PACE administration.]


Figure 10.3-22

NAEP BIB/PACE Population Equating
THETA Distributions - Age 13

[Plot not reproduced: estimated THETA distributions under BIB
administration and under PACE administration; horizontal axis THETA,
marked from -5 to 5.]


Figure 10.3-23

NAEP BIB/PACE Population Equating
THETA Distributions - Age 17

[Plot not reproduced: estimated THETA distributions under BIB
administration and under PACE administration.]


Table 10.3(12)
BIB/Pace Subgroup Means - Age 9


SUBGROUP                                  BIB      PACE

TOTAL                                     213.3    213.5

GENDER               Male                 210.3    210.2
                     Female               216.2    216.7

ETHNICITY            White                220.1    220.6
                     Black                189.0    186.9
                     Hispanic             193.6    [illegible]
                     Other                222.0    227.0

REGION               NE                   217.3    217.6
                     SE                   207.9    205.1
                     Central              217.4    217.8
                     West                 210.9    213.6

PARENTAL EDUCATION   < H.S.               195.6    204.4
                     Grad HS              211.3    212.4
                     Post HS              224.4    224.2
                     IDK                  207.1    205.6
                     Missing              177.5    174.0

STOC                 Rural                205.1    205.5
                     Low Met              194.2    197.1
                     High Met             232.3    230.7
                     Big City             210.6    212.8
                     Fringe               214.4    215.2
                     Med City             213.4    212.5
                     Small Place          214.5    214.8

ABOVE, AT, OR BELOW  < Modal Grade        187.4    187.6
MODAL GRADE LEVEL    = Modal Grade        221.7    225.5
                     > Modal Grade        250.5    254.6
                     Missing                0.0      0.0

ITEMS IN HOME        < 3 Items            201.3    200.9
                     = 3 Items            217.3    217.7
                     = 4 Items            224.8    227.7
                     IDK                  177.8    155.6
                     Missing              179.2    174.9

TV                   0-2 Hours            220.4    218.6
                     3-5 Hours            219.5    221.7
                     6-More               201.6    203.9
                     Missing              205.2    192.5


Table 10.3(13)
BIB/Pace Subgroup Means - Age 13


SUBGROUP                                  BIB      PACE

TOTAL                                     258.3    258.0

GENDER               Male                 253.9    253.1
                     Female               262.7    263.1

ETHNICITY            White                263.7    263.8
                     Black                237.3    236.7
                     Hispanic             241.4    237.4
                     Other                262.1    262.6

REGION               NE                   261.2    261.4
                     SE                   257.4    256.4
                     Central              258.8    260.0
                     West                 256.1    254.7

PARENTAL EDUCATION   < H.S.               241.7    241.2
                     Grad HS              253.3    255.3
                     Post HS              267.2 (only one value legible; column uncertain)
                     IDK                  237.3    237.4
                     Missing              255.0    254.7

STOC                 Rural                255.8    256.0
                     Low Met              239.5    239.4
                     High Met             275.7    273.4
                     Big City             255.2    254.8
                     Fringe               260.7    260.0
                     Med City             257.6    257.[illegible]
                     Small Place          258.3    258.[illegible]

BELOW, AT OR ABOVE   < Modal Grade        240.0    238.7
MODAL GRADE          = Modal Grade        266.5    266.9
                     > Modal Grade        300.6    295.8
                     Missing                0.0      0.0

ITEMS IN HOME        < 3 Items            240.6    243.6
                     = 3 Items            256.1    255.1
                     = 4 Items            266.2    265.1
                     IDK                  248.1    194.2
                     Missing              250.4    249.2

TV                   0-2 Hours            267.1    267.5
                     3-5 Hours            262.2    262.4
                     6-More               245.8    247.3
                     Missing              243.7    240.2

HOMEWORK             Had None             255.5    256.6
                     Didn't Do            246.5    248.8
                     < 1 Hour             261.1    261.1
                     1-2 Hours            265.6    266.4
                     > 2 Hours            263.7    262.8
                     Missing              238.5    232.9


Table 10.3(14)
BIB/Pace Subgroup Means - Age 17


SUBGROUP                                  BIB      PACE

TOTAL                                     288.1    288.3

GENDER               Male                 282.7    284.6
                     Female               [illegible]  [illegible]

ETHNICITY            White                293.9    295.4
                     Black                265.1    259.8
                     Hispanic             269.1    268.2
                     Other                288.4    283.0

REGION               NE                   289.8    291.9
                     SE                   285.3    282.1
                     Central              289.2    289.3
                     West                 287.7    290.0

PARENTAL EDUCATION   < H.S.               269.8    268.7
                     Grad HS              280.5    280.7
                     Post HS              [illegible]
                     IDK                  256.8    255.2
                     Missing              292.1    265.3

STOC                 Rural                283.0    282.0
                     Low Met              265.8    266.2
                     High Met             300.4    301.3
                     Big City             288.2    287.2
                     Fringe               289.8    294.4
                     Med City             291.0    288.2
                     Small Place          287.7    288.2

BELOW, AT, OR ABOVE  < Modal Grade        260.3    256.9
MODAL GRADE FOR AGE  = Modal Grade        294.8    295.0
                     > Modal Grade        304.3    300.7
                     Missing                0.0      0.0

ITEMS IN HOME        < 3 Items            266.5    267.1
                     = 3 Items            282.7    284.7
                     = 4 Items            294.8    294.4
                     IDK                  233.5    229.4
                     Missing              274.8    224.9

TV                   0-2 Hours            295.0    295.6
                     3-5 Hours            283.7    286.1
                     6-More               269.3    273.3
                     Missing              255.9    253.9

HOMEWORK             Had None             276.8    280.4
                     Didn't Do            286.9    290.2
                     < 1 Hour             288.9    289.5
                     1-2 Hours            293.5    293.0
                     > 2 Hours            299.[illegible]  297.1
                     Missing              245.5    241.7


Figure 10.3-24

COMPARISON OF ESTIMATED B VALUES
PACE READING ITEMS - AGE 9 VS AGE 13

[Plot not reproduced: estimated b parameters for items administered to
both ages; one axis is labeled AGE 9 PACE PARAMETERS.]


Figure 10.3-25

COMPARISON OF ESTIMATED B VALUES
PACE READING ITEMS - AGE 17 VS AGE 13

[Plot not reproduced: estimated b parameters for items administered to
both ages; one axis is labeled AGE 17 PACE PARAMETERS.]


Chapter  10.4 
TREND  ANALYSIS 


Robert  J.  Mislevy 
Kathleen M. Sheehan

Educational  Testing  Service 


Tracking trends of reading proficiency over time, as well as offering
comparisons among subpopulations at a given time, are major objectives of
the National Assessment.  In this chapter, we summarize procedures taken
toward these ends with NAEP reading data.  Attention is focused on IRT
procedures.  We describe the manner in which data from past reading
assessments were selected for analysis, item calibration procedures (again
under the three-parameter logistic model), the estimation of effects
for historically important NAEP reporting variables, and the generation of
plausible values.  These procedures were carried out on responses solicited
under paced tape administration, including the Year 15 paced tape bridging
study.  The methods used to place these results on the Year 15 BIB reading
proficiency scale were described in Section 10.3.6.


10.4.1    Estimation  of  Trends 

To obtain maximum information about trends in reading proficiency over
time, separate trend lines were estimated for each age group.  Each
within-age analysis was conducted using data from the 1983-84 assessment
and from three previous assessments.  The previous assessments were
conducted by the Education Commission of the States (ECS) during the
1970-71, 1974-75, and 1979-80 school years.  These three assessments are
referred to as Years 2, 6 and 11 in the technical discussions which follow.
The 1983-84 assessment, conducted by Educational Testing Service, is
referred to as Year 15.  All assessments prior to Year 15 were conducted
under paced tape administration conditions.  The Year 15 assessment was
conducted under both paced tape and BIB spiral conditions.  To avoid
confounding pacing effects with time effects, the paced tape and BIB spiral
data collected in Year 15 were analyzed separately.

The original analysis plan was to estimate item parameters for BIB data
and trend data separately using LOGIST, to link the scales on the basis of
LOGIST item parameter estimates, and then to obtain maximum likelihood
estimates (MLEs) for the ability of each sampled pupil with these item
parameter estimates via the MLE-ABIL program (M.S. Wingersky, 1984).

As noted in the preceding chapters, LOGIST succeeded in providing item
parameter estimates from the BIB data.  MLEs were superseded by plausible
values, however, due to problems with infinite MLEs.  At this point,
plausible values computed from LOGIST item parameters characterized the BIB
data.

As a precursor to estimating trend item parameters with LOGIST, an
analysis of the number of items linking assessment booklets in previous
reading assessments was carried out.  Such links are LOGIST's basis for
determining a common scale across test forms.  The analysis revealed very
weak links among the booklets of a given age in a given year; often only
three or four linking items were present, and their content was not
necessarily representative of the assessment as a whole.  Another source of
information for linking that was available, however, lies in the fact that
with appropriate case weights, the samples administered each booklet in a
given age/year were randomly equivalent samples from the same population.
BILOG is able to incorporate this linking information, and was therefore
selected for item calibration in the trend data.  For reasons discussed in
Section 10.3.6, separate analyses were carried out for each age.

Plausible values were then computed for trend data in essentially the
same manner as described for BIB data in the preceding section.  By the
process described in Section 10.3.6, an equating of the 1984 BIB age
samples (with LOGIST item parameters) and pace samples (with BILOG item
parameters) was carried out.  The checks described in Section 10.3.6 proved
unsatisfactory.  It was hypothesized that the differences were due to the
use of LOGIST item parameters for one sample and BILOG for the other.
After BIB item parameters were re-estimated with BILOG (Section 10.3.2),
the linking of BIB and pace did prove more satisfactory.  Plausible values
based on BILOG item parameters thus provide the data on trends that were
reported in The Reading Report Card:  Progress Toward Excellence in Our
Schools (1985).


10.4.2    Selection  of  Items  and  Forms  to  Include 

To maximize year-to-year linkages while minimizing the total number of
items calibrated, only those booklets which contained relatively high
proportions of items in common with other assessment years were included in
the trend analysis.  Table 10.4(1) lists the booklets selected for each
grade/age.  For Year 2, the eight booklets which provided the most linking
items were selected.  Since only three booklets were administered in Year
6, all three were selected.  For Year 11, only those booklets which
contained five or more items in common with the Year 15 data were selected.
All of the booklets administered in pace format in Year 15 were selected.
These booklets provided a total of 633 distinct verbal items:  496 reading
items and 137 study skills items.  The reading items were used in the
IRT-based trend analysis which is documented here.  Throughout this
chapter, items are identified by a three digit number which indicates their
position on the trend data file.  Table B-1 in Appendix B provides
additional identification information for each item, including both the ECS
ID number and the ETS ID number, where appropriate.


Items which had been administered by both ECS and ETS were examined to
ensure equivalence across administrations.  This investigation revealed 33
reading items which had undergone significant changes in format between the
ECS and the ETS administrations.  These 33 items were treated as new items
in Year 15.  Table 10.4(2) lists these 33 "changed" items along with their
old and new item numbers.

The addition of 33 "new" reading items to the trend data yielded a
total of 529 effective items for the IRT trend analysis.  Of this total,
217 were administered to 9-year-olds, 211 were administered to
13-year-olds, and 205 were administered to 17-year-olds.  These data were
screened for anomalous patterns in proportion-correct across administration
years.  Table 10.4(3) lists proportion-correct data for eleven items which
were flagged by the screening procedure.  Several of these items were
eventually excluded from the analysis.  Excluded items are noted in the
table by an N (Not Calibrated).  Items retained in the analysis are noted
by a C (Calibrated).  Supporting evidence for the decision to include or
exclude each "questionable" item follows.

Three of the items listed in Table 10.4(3) (#119, #145, and #368) were
flagged because they showed a sudden drop in proportion-correct in Year 15.
Further investigation of these items revealed that all three contained
significant format changes which had been previously overlooked.  These
three items were excluded as noted.  Five items which were not
administered in Year 15 showed unexplained shifts in the proportion of
correct responses in Years 2, 6, and 11.  These items (#3, #142, #150,
#153, and #175) were also excluded.  (It is useful to note that Item #3 was
the only item excluded from the analysis of the BIB data.  In the BIB data
file, this item was coded as Item #20.)  Item #105 was excluded from the
Age 17 analysis because it was too easy to provide any information about
the population distribution; in the calibration sample of at least 1,000
subjects responding to the item, its proportion-correct value was 1.0.
Item #251 was excluded from the Age 9 analysis for the same reason.  Item
#179 was flagged because the calculated proportion-correct for Age 9 was
higher than that for Age 13.  This item was found to have been mis-keyed
and was excluded from all trend analyses.  In total, five items were
excluded from the Age 9 trend data, seven items were excluded from the Age
13 trend data and three items were excluded from the Age 17 trend data.
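
The kind of screening described above can be sketched as a simple rule
set.  The actual screening rules are not given in this report, so the
thresholds, the function name, and the example values below are all
hypothetical; the sketch flags large year-to-year shifts and items whose
proportion-correct sits at the ceiling.

```python
def flag_item(p_by_year, max_shift=0.15, ceiling=0.995):
    """Return a list of reasons for flagging an item.

    p_by_year maps assessment year (2, 6, 11, 15) to proportion-correct.
    max_shift and ceiling are hypothetical thresholds, not NAEP values.
    """
    reasons = []
    years = sorted(p_by_year)
    # Compare each administration with the one that preceded it.
    for prev, curr in zip(years, years[1:]):
        if abs(p_by_year[curr] - p_by_year[prev]) > max_shift:
            reasons.append(f"shift between Year {prev} and Year {curr}")
    # An item nearly everyone answers correctly carries no information.
    if any(p >= ceiling for p in p_by_year.values()):
        reasons.append("proportion-correct at ceiling")
    return reasons

# Hypothetical items (illustration only).
sudden_drop = {2: 0.60, 6: 0.62, 11: 0.61, 15: 0.40}
too_easy = {11: 1.0, 15: 1.0}
stable = {2: 0.50, 6: 0.55}

print(flag_item(sudden_drop))
print(flag_item(too_easy))
print(flag_item(stable))
```

A flag of this kind only prompts inspection; as in the cases described
above, the decision to exclude rests on what the follow-up reveals, such
as a format change or a mis-keyed item.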


10.4.3    Estimation  of  Trend  Item  Parameters 

The BILOG computer program was used to estimate item parameters for the
trend data. Separate calibrations were performed for each age group. A
random sample of examinees was selected for each calibration. The sampling
frame consisted of all examinees who were administered the booklets listed
in Table 10.4(1). The sample selected for each age group provided
approximately 1,000 examinees for each item for each year. The modified
BILOG priors described in Section 10.3.2 were found to be appropriate for
all three age groups. Each "Age-Only" calibration required approximately
twelve EM cycles and one Newton step.




Table 10.4(1)
Booklets Selected for Calibrating Trend Items

Age   Year   Booklets

 9      2    1,2,3,4,5,6,7,9
        6    1,2,3
       11    1,2,3,4,10,11
       15    1,2,3,4

13      2    1,2,3,4,5,11,12,13
        6    1,2,3
       11    1,2,3,14
       15    1,2,3,4

17      2    2,3,4,5,7,8,9,10
        6    1,2,3
       11    1,2,3,11
       15    1,2,3,4




Table 10.4(2)
"Changed" Reading Items

New                                         Old
Item                                        Item
No.    ECS ID              ETS ID           No.

634    H284000-A002/002    N003202          [illegible]
635    H284000-A003/003    N003203          [illegible]
636    7102010-A001/001    N005101          92
637    7099007-A001/001    N005001          12
638    7103044-A002/002    N001702          [illegible]
639    7103044-A003/003    N001703          [illegible]
640    H265000-A003/003    N002003          [illegible]
641    7503045-A001/001    N003301          [illegible]
642    7303019-A002/002    N001202          [illegible]
643    7401016-A001/001    [illegible]      [illegible]
644    7103020-A001/001    N003901          117
645    H413000-A001/001    N008201          [illegible]
646    H413000-A005/005    N008205          [illegible]
647    H222000-A004/004    N001603          407
648    7127001-A002/002    N003802          170
649    7127001-A003/003    N003803          171
650    7099006-A001/001    N004201          10
651    7099006-A002/002    N004202          11
652    7127003-A002/002    N002102          175
653    H442000-A001/001    N008108          519
654    H241000-A002/002    N004402          418
655    H241000-A003/003    N004403          419
656    H404000-A004/004    N013104          472
657    H416000-A002/002    N010002          499
658    7503001-A001/001    N011101          379
659    H205000-A001/001    N010501          398
660    7102008-A001/001    N010301          91
661    H201000-A002/002    N008602          388
662    H405000-A001/001    N001501          473
663    H405000-A002/002    N001502          474
664    H405000-A004/004    N001504          476
665    7401071-A001/001    N010201          355
666    H287000-A001/001    N013501          451


Table 10.4(3)
Proportion Correct for Questionable Items

[Original columns: Item¹, Age², Year 2, Year 6, Year 11, Year 15, Status³.
Proportions appear below in the order printed; where a row has fewer than
four values, the year column of each value cannot be recovered.]

Item          Age           Proportion correct                  Status

3             9             0.1863  0.2095  0.1642              N
              13            0.1801  0.1273  0.1233              N
105           17            1.0000                              N
[illegible]   9             [three values illegible]            N
              13            0.9353  0.9068  0.9165              C
[illegible]   [illegible]   [two values illegible]              [illegible]
              17            0.4799  0.4611  0.4533              C
145           13            0.5847  0.5904  0.4794              N
              17            [two values illegible]  0.6096      N
150           9             0.6962  0.6594  0.7827              C
              13            [values illegible]                  N
153           13            [two values illegible]              N
              17            0.2349  0.1580  0.1329              C
175           9             0.1177  0.1538  0.1158              C
              13            0.5559  0.4383  0.4416              N
              17            0.5812  0.7061  0.5724              N
179           9             0.1288                              N
              13            0.0451                              N
251           9             1.0000                              N
368           9             0.6153  0.5972  0.7135  0.4990      N

¹ The item number is the position of the item on the trend data file.

² Only age groups to which the item was administered are listed.

³ Items marked N were administered but not calibrated; items marked C were
administered and calibrated.




Diagnostic  plots  were  produced  after  every  fourth  cycle.    A  sample 
diagnostic plot is given in Figure 10.4-1. As in the figures presented
earlier,  the  smooth  line  is  the  fitted  three-parameter  logistic  item 
response  curve  and  the  points  represent  expected  proportions  of  correct 
responses  for  various  subgroups  of  examinees.    In  these  particular  plots, 
examinees  are  classified  according  to  the  calendar  year  in  which  they  were 
tested. 

These  plots  revealed  six  poorly  fitting  items  which  were  later  excluded 
from  the  trend  analysis.    Table  10.4(4)  identifies  each  item  and  indicates 
which  particular  "Age-Only"  analysis  was  affected.    Diagnostic  plots  for 
these  six  items  are  given  in  Figure  10.4-2. 

The  number  of  items  included  in  the  final  trend  calibrations  is 
provided  in  Table  10.4(5).    The  final  item  parameter  estimates  and 
corresponding  standard  errors  are  given  in  Tables  B-2  through  B-4  in 
Appendix  B.      These  parameter  estimates  were  originally  estimated  on  a 
provisional  scale  but  were  re-scaled  so  that,  for  each  age  group,  the 
distribution  of  reading  ability  in  the  Year  15  pace  sample  would  have  the 
same  first  two  moments  as  the  distribution  of  reading  ability  in  the  BIB 
sample.    Ability  distributions  were  estimated  using  RESOLVE  (Mislevy, 
1985c).    The  trend  item  parameters  were  re-scaled  in  this  manner  so  that 
the  results  of  the  trend  analysis  could  be  reported  on  the  Reading 
Proficiency  Scale.    (Additional  details  on  the  Year  15  BIB/pace  linkage  are 
provided  in  Section  10.3.5). 
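A minimal Python sketch of a re-scaling of this kind is given below. It
uses the standard linear transformation of a three-parameter logistic
scale, under which response probabilities are unchanged; the report does
not give the formulas, so the function names, the scaling constant
D = 1.7, and all numerical values are assumptions made for illustration.

```python
import math


def p3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response.
    D = 1.7 is an assumed scaling constant."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def linking_constants(mean_from, sd_from, mean_to, sd_to):
    """Slope A and intercept B of theta_new = A * theta_old + B, chosen so
    that the first two moments on the old scale match the target."""
    A = sd_to / sd_from
    B = mean_to - A * mean_from
    return A, B


def rescale_item(a, b, c, A, B):
    """Move item parameters to the new scale; c is unaffected."""
    return a / A, A * b + B, c


# Hypothetical moments: provisional scale (0.2, 0.8), target scale (0.0, 1.0).
A, B = linking_constants(0.2, 0.8, 0.0, 1.0)
a_new, b_new, c_new = rescale_item(1.2, 0.3, 0.2, A, B)

theta_old = 0.5
theta_new = A * theta_old + B
# The two probabilities agree, so the fit of the model is not altered.
print(p3pl(theta_old, 1.2, 0.3, 0.2), p3pl(theta_new, a_new, b_new, c_new))
```

Because the transformation leaves every response probability unchanged,
only the unit and origin of the scale are affected.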

Tables  B-5  through  B-7  in  Appendix  B  provide  item  linkage  information 
for  all  of  the  items  included  in  the  final  BILOG  calibrations.  This 
information includes:

(1)  the  total  number  of  items  calibrated  in  each  booklet; 

(2)  the  number  of  calibrated  items  linking  booklets  across  years; 
and 

(3)  the  number  of  calibrated  items  linking  booklets  within  years. 

The position of each item within its test booklet is also provided.
(These  numbers  appear  in  the  columns  of  the  tables.) 


10.4.4    Estimation  of  Conditional  Effects 

Conditional  distributions  of  reading  proficiency  given  background 
responses  were  estimated  separately  for  each  age  group  and  for  each 
assessment  year.    Background  variables  were  chosen  to  be  as  similar  as 
possible  to  the  background  variables  used  in  the  analysis  of  the  Year  15 
BIB  data. 

One  change  that  could  not  be  avoided  involved  the  definition  of 
examinee  ethnicity.    In  the  Year  15  assessment,  information  about  examinee 
ethnicity  was  available  from  a  variety  of  different  sources.  This 
information  was  combined  to  form  a  derived  variable,  labeled  "imputed 
race/ethnicity,"  which  was  used  in  the  analysis  of  the  Year  15  BIB  data. 




Figure 10.4-1

Diagnostic Plot for Item 87
(Calibrated for Examinees in Grade 4/Age 9)

ECS ID = 71020004.A001/001; ETS ID = N014101
V = Year 2       O = Year 6       X = Year 11

[Plot not reproduced. Horizontal axis: Theta.]


Figure 10.4-2

Plots of Items Excluded During Preliminary Calibrations of Trend Data

V = Year 2       O = Year 6       X = Year 11

[Plots not reproduced. Surviving panel titles: Item 51, Item 291,
Item 117. Horizontal axis: Theta, from -4 to 4.]

Table 10.4(4)
Items Excluded During Preliminary Calibration Runs*


Item  Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17


51  C  N

100  N

117  N

205  C  N  C

291  C  N

292  C  N


*C = administered and calibrated; N = administered but
not calibrated.


Table 10.4(5)
Item Calibration Summary

                                            Number
                   Number of   Number       Excluded      Number
                   Potential   Excluded     After         Included
                   Trend       During       Initial       in Final
                   Items       Screening    Calibration   Calibration

Grade 4/Age 9        217          5            1            211
Grade 8/Age 13       211          7            2            202
Grade 11/Age 17      205          3            3            199




(The  exact  definition  of  the  "imputed  race/ethnicity"  variable  can  be  found 
in  Section  12.1.)    However,  because  the  only  type  of  ethnicity  information 
available  from  previous  NAEP  assessments  was  observed  ethnicity,  the 
conditioning  variable  used  for  the  trend  analysis  was  "observed  ethnicity" 
rather  than  "imputed  race/ethnicity." 

The  variable  coding  scheme  developed  for  the  data  collected  in  Years  6 
and  11,  and  the  Year  15  pace  data,  mirrored  the  scheme  developed  for  the 
Year  15  BIB  data  with  one  exception:     the  grade/age  variable  was  coded  with 
three  levels  rather  than  five.    These  three  levels  were  defined  as: 


Level  Description 

1  at  Modal  age,  <  Modal  grade 

2  at  Modal  age,  =  Modal  grade 

3  at  Modal  age,  >  Modal  grade 


This  same  three-level  grade/age  variable  was  included  in  the  coding 
scheme  developed  for  the  Year  2  data.  Two  additional  changes  were  also 
incorporated  into  the  coding  scheme  developed  for  the  Year  2  data: 

(1)    The  ethnicity  effect  was  coded  with  two  levels  rather  than 
three,  because  in  Year  2,  Hispanics  were  not  coded  as  a 
separate  ethnic  group.    The  two  coded  levels  for  ethnicity 
were 

Level  Description 

1  Black 

2  White  and  Other  (Including  missing) 


(2)    The  Region  effect  was  excluded,  because  the  Region  variable 
was  incorrectly  coded  in  Year  2. 

The dataset used to estimate conditional effects included all examinees
who were administered a booklet containing at least two calibrated items.
This dataset included some examinees who were administered booklets that
were not used for item calibration but that did include two or more items
also appearing in booklets used for item calibration.
These additional booklets are listed in Table 10.4(6). Examinees who did
not respond to any of the calibrated items were included on the file but
were not used by the estimation procedure. The estimated effects, and
sample sizes, are given in Tables 10.4(7) through 10.4(10).




10.4.5    Generation  of  Plausible  Values 


Five plausible values were generated for each examinee who was
administered at least one of the booklets listed in Tables 10.4(1) and
10.4(6). The methodology used to generate the plausible values exactly
parallels the methodology which was used to generate plausible values for
the Year 15 BIB data. This methodology is described in Section 10.3.4. The
file of plausible values produced was used to estimate the trend lines
which were reported in The Reading Report Card (1985).




Table 10.4(6)

Additional Booklets* Used For Estimating Conditional Distributions

Age   Year   Booklets

 9      2    8
       11    6,8

13      2    6,8
       11    4,6,13,15

17      2    1,6
       11    4,13,14

*These booklets were not used to estimate item parameters. However,
each contained two or more items which also appeared in a booklet
which was used to estimate item parameters.




Table 10.4(7)

Estimated Conditional Effects
Year 2 Pace Data

Effect         Level                     Age 9        Age 13       Age 17

Intercept      All subjects             -2.570618    -1.182882    -1.087587

Sex            Female                    0.205023     0.201680     0.171321

Ethnicity      White and Other           0.708416     0.638349     0.696148

STOC           High Metro                0.460787     0.088009     0.314084
               Not High or Lo Metro      0.201538    -0.071709     0.127357

Parental Ed.   High School               0.282657     0.232633     0.227909
               Beyond HS                 0.489658     0.437952     0.484824
               All else                  0.108501    -0.03?201     0.06542?

Grade/Age      = M age, = M grade        0.702134     0.581215     0.740842
               = M age, > M grade        1.137812     0.904846     0.911997

Misc.          Subjects with
               unrecoverable
               missing values            0.448089    -0.257232    -1.898172

                                         Age 9        Age 13       Age 17

Number of Examinees                      18,096       23,938       18,417
Estimated Variances                      0.4631       0.30544      0.45796


Table 10.4(8)

Estimated Conditional Effects
Year 6 Pace Data

Effect         Level                     Age 9        Age 13       Age 17

Intercept      All subjects             -2.334113    -1.295558    [illegible]

Sex            Female                    0.195583     0.221243     0.154221

Ethnicity      White and Other           0.539376     0.595777    [illegible]
               Hispanic                  0.112853     0.281520     0.374550

STOC           High Metro                0.440385     0.355996    [illegible]
               Not High or Lo Metro      0.278599     0.216792     0.173098

Region         Central                  -0.135823    -0.041520    -0.105375
               Southeast                 0.067479     0.083684     0.008417
               West                     -0.069881    -0.043541    -0.106991

Parental Ed.   High School               0.27?121     0.189764     0.166982
               Beyond HS                 0.413783     0.427720     0.433708
               All else                  0.1248??    -0.023117    -0.201775

Grade/Age      = M age, = M grade        0.625358     0.512917     0.649586
               = M age, > M grade        1.035986     0.833832     0.810016

Misc.          Subjects with
               unrecoverable
               missing values            1.223144    -0.159365     0.838545

                                         Age 9        Age 13       Age 17

Number of Examinees                      21,697       21,393       19,624
Estimated Variances                      0.39646      0.32585      0.38421


Table 10.4(9)

Estimated Conditional Effects
Year 11 Pace Data

Effect         Level                     Age 9        Age 13       Age 17

Intercept      All subjects             -2.114314    -1.032202    -0.867096

Sex            Female                    0.159384     0.145401     0.114560

Ethnicity      White and Other           0.451578     0.491249     0.611792
               Hispanic                  0.081409     0.2?6617     0.349791

STOC           High Metro                0.481785     0.252162     0.277156
               Not High or Lo Metro      0.270451     0.066009     0.168417

Region         Central                  -0.098071    -0.057445    -0.046963
               Southeast                -0.005778     0.083751    -0.011844
               West                     -0.088577    -0.067965    -0.042691

Parental Ed.   High School               0.228851     0.215626     0.140464
               Beyond HS                 0.409781     0.490029     0.439863
               All else                  0.100306    -0.076120    -0.102193

Grade/Age      = M age, = M grade        0.615816     0.433504     0.616226
               = M age, > M grade        1.187973     0.777249     0.792824

Misc.          Subjects with
               unrecoverable
               missing values            0.740209     0.917640     0.246829

                                         Age 9        Age 13       Age 17

Number of Examinees                      21,158       22,321       18,099
Estimated Variances                      0.39041      0.33354      0.35543


Table 10.4(10)

Estimated Conditional Effects
Year 15 Pace Data

Effect         Level                     Age 9        Age 13       Age 17

Intercept      All subjects             -1.850093    -0.908307    -0.537325

Sex            Female                    0.098570     0.140171     0.119340

Ethnicity      White and Other           0.518862     0.425534     0.493804
               Hispanic                  0.026952     0.076946     0.287789

STOC           High Metro                0.373506     0.279597     0.238584
               Not High or Lo Metro      0.150687     0.096120     0.174528

Region         Central                  -0.143570     0.033821    -0.074830
               Southeast                 0.000213     0.013760    -0.037671
               West                     -0.037804    -0.021991    -0.016431

Parental Ed.   High School               0.040130     0.161891     0.123633
               Beyond HS                 0.240621     0.328296     0.416555
               All else                 -0.029917     0.013953    -0.146127

Grade/Age      = M age, = M grade        0.?52164     0.473655     0.542676
               = M age, > M grade        1.328896     0.915722     0.665729

Misc.          Subjects with
               unrecoverable
               missing values            1.159244     1.204793     1.443256

                                         Age 9        Age 13       Age 17

Number of Examinees                      5,492        5,158        6,209
Estimated Variances                      0.41479      0.33445      0.41769


Chapter  10.5 
THE  NAEP  READING  SCALE 


Albert  E.  Beaton 
Educational  Testing  Service 


Since  its  inception,  a  major  goal  of  the  National  Assessment  of 
Educational  Progress  (NAEP)  has  been  to  report  to  decision  makers  at  all 
levels what youths can and cannot do. To be useful, its reports should be
psychometrically  sound,  yet  easily  interpretable;  reports  should  be  clear 
and  concise,  yet  should  not  miss  important  subtleties  of  the  learning  area 
being  assessed.    The  essential  conflict  between  simplicity  and  detail 
requires  careful  thought,  and  decisions  must  be  made  about  what  information 
is  most  useful  and  what  information  can  be  judiciously  excluded. 

The  NAEP  staff  has  carefully  considered  how  NAEP  results  would  best  be 
presented.  The  dimensionality  of  the  Year  15  reading  data  was  examined  and 
it  was  found  that  much  of  the  reading  information  could  be  summarized  using 
a  single  dimension.    Item  response  theoretic  (IRT)  methods  were  used  as  a 
way of estimating the item parameters for that reading dimension. Using
the  item  parameters,  sampling  information,  and  the  available  information 
about  individual  students,  estimates  of  the  reading  proficiency  of  American 
youth  were  made.    After  equating  for  differences  in  methods  of 
administering  items,  reading  data  from  the  Year  2  (1970-71),  Year  6 
(1974-75),  and  Year  11  (1979-80)  assessments  were  also  scaled  and 
population  estimates  were  computed. 

The  purpose  of  this  section  is  to  describe  the  way  that  the  NAEP 
reading  results  were  presented.    The  next  section  will  discuss  the  NAEP 
reading  proficiency  scale,  which  can  be  thought  of  as  estimated  true  scores 
on  a  hypothetical  test  with  known  properties.    After  the  scale  is 
presented,  we  will  discuss  the  anchoring  of  several  scale  points  to  specify 
what  students  at  those  points  can  and  cannot  do. 


10.5.1    The  NAEP  Reading  Scale 

In its earlier years, NAEP reported educational progress by presenting
the  estimated  percentage  of  students  who  responded  correctly  to  each 
exercise.    The  percentages  passing  were  also  presented  for  selected 
subpopulations  such  as  the  different  sexes,  racial/ethnic  groupings,  and 
regions  of  the  country.    This  approach  allows  a  very  detailed 
interpretation  of  what  students  can  and  cannot  do.    Insofar  as  the  actual 
text of the exercises was made publicly available, a reviewer or policy
maker could look at each exercise and, using its percent passing, judge the
adequacy  of  student  performance. 

This  approach  soon  proved  to  be  cumbersome  because  too  much  detailed 
information  was  available  for  most  policy  makers  to  integrate  and 
interpret.    Some  method  of  summarizing  the  information  was  clearly 
necessary.    The  past  solution  to  the  over-abundance  of  information  was  to 
publish  the  average  of  the  percents  correct  over  all  exercises  in  a  subject 
area  such  as  reading.    To  avoid  omitting  too  much  detail,  the  average 
percent  correct  was  also  presented  for  sub-areas  of  interest;  for  example, 
in  the  reading  assessment,  the  average  percents  correct  were  presented 
separately  for  literal  comprehension,  inferential  comprehension,  and 
reference  skill  exercises. 

If all of the exercises had been administered to all of the students,
then  the  average  percent  correct  over  all  exercises  would  have  been  the 
same  as  the  average  percent  correct  over  all  students;  that  is,  we  could 
have  reached  the  same  value  by  computing  the  percent  correct  for  each 
student  and  averaging  over  all  students.    The  average  percent  correct  could 
thus  be  considered  as  the  average  of  the  students'  scores  on  a  percent 
correct  scale.    However,  it  should  be  noted  that  the  matrix  sampling 
methods  used  in  past  and  present  NAEPs  have  the  effect  that  all  students  do 
not  receive  the  same  exercises,  so  the  average  percent  correct  statistic  is 
not  precisely  the  same  as  averaging  individual  scores. 
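
The point can be checked with a small Python sketch. The response
matrices below are invented for illustration (1 = correct, 0 =
incorrect, None = exercise not administered to that student); they are
not NAEP data.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def average_over_exercises(responses):
    """Percent correct for each exercise, then averaged over exercises."""
    n_exercises = len(responses[0])
    per_exercise = []
    for j in range(n_exercises):
        column = [row[j] for row in responses if row[j] is not None]
        per_exercise.append(mean(column))
    return mean(per_exercise)


def average_over_students(responses):
    """Percent correct for each student, then averaged over students."""
    return mean(mean(x for x in row if x is not None) for row in responses)


# Every student takes every exercise: the two averages coincide.
complete = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]

# Matrix sampling: each student takes only some exercises.
sampled = [[1, 1, None], [1, None, 0], [None, 1, 1], [0, 0, None]]

print(average_over_exercises(complete), average_over_students(complete))
print(average_over_exercises(sampled), average_over_students(sampled))
```

With the complete matrix both averages are equal; with the sampled
matrix they differ, because exercises and students no longer contribute
equal numbers of responses.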

The  average  percent  correct  metric  makes  it  awkward  to  report  what 
students  can  and  cannot  do.    First,  the  average  percent  correct  metric 
depends  on  the  selection  of  exercises;  the  selection  of  easy  or  difficult 
exercises  could  make  student  performance  look  good  or  bad,  especially  to  a 
public  that  is  accustomed  to  a  "passing  score"  of  70  percent,  for  example. 
Second,  since  the  metric  is  dependent  on  the  selection  of  items,  the  items 
cannot  be  changed  over  time;  items  cannot  be  retired  and  replaced  without 
changing  the  metric.    This  also  restricts  the  ability  to  release  exercises 
to  the  public.    Third,  age-to-age  and  grade-to-grade  comparisons  require 
that  the  same  exercises  be  administered  at  all  age  or  grade  levels. 
Finally, even if all exercises were administered to all students, the
average  percent  correct  would  not  indicate  what  they  could  or  could  not  do 
without  examination  of  the  individual  exercise  information. 

Besides  the  percent  correct  metric,  we  considered  and  rejected  a  number 
of  other  reporting  metrics.    We  did  not  want  to  present  performance  in  a 
norm-referenced  metric  since  the  question  was  what  students  can  and  cannot 
do,  not  how  they  compare  to  each  other  or  some  norm  group;  thus, 
percentiles  and  grade  equivalents  were  rejected.    We  did  not  want  the 
metric  easily  confused  with  well  known  scales  such  as  the  SAT  or  ACT 
scales.  And,  of  course,  any  scale  that  might  be  confused  with  IQ  might 
mislead  the  casual  reader  about  the  assessment's  meaning. 

A seemingly simple way to proceed would have been to use the metric
which is implicit in the IRT scaling procedures that were used. The LOGIST
program produces a value called theta for each subject, and this value is
an estimate of the subject's proficiency on the dimension being measured.
The BILOG procedure produces a distribution of plausible values in the same
metric. Typically, the values of theta are standardized so that the average
over all subjects is zero and the standard deviation is unity.¹ However,
the theta scale results in negative scores which are more difficult for the
public to interpret, and this might unduly affect a subgroup which received
an average score below zero. Also, the theta scale is unbounded, with
possible values anywhere from minus to plus infinity.

The  LOGIST  and  BILOG  programs  can  also  produce  an  alternative  score 
called  the  xi  score.    The  xi  score  is  the  estimated  true  score  on  a  test. 
For the Year 15 reading assessment, 228 scaled exercises were administered
at one or more grades or ages. Thus, the xi scores would range from 48.7,
chance level, to 228, the number of exercises. An advantage of the xi
score  is  that  it  makes  finite  estimates  possible  for  all  subjects;  those 
who  respond  to  all  exercises  correctly  are  estimated  to  have  perfect  true 
scores  on  the  test  and  those  who  did  not  do  as  well  as  chance  are  estimated 
at  the  chance  level.  Also,  since  the  xi  scores  are  like  test  scores,  they 
are  in  a  familiar  type  of  metric. 
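
A minimal Python sketch of an estimated true score of this kind follows.
It assumes the three-parameter logistic form with a scaling constant of
1.7, and the four items are hypothetical; the actual Year 15 item
parameters are given in Appendix B and are not reproduced here.

```python
import math


def p3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response
    (D = 1.7 is an assumed scaling constant)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def xi_score(theta, items):
    """Estimated true score: the expected number of items answered
    correctly at proficiency theta."""
    return sum(p3pl(theta, a, b, c) for a, b, c in items)


# Hypothetical (a, b, c) triples, for illustration only.
ITEMS = [(1.0, -1.0, 0.25), (1.5, 0.0, 0.20), (0.8, 1.0, 0.20), (1.2, 2.0, 0.25)]

chance_level = sum(c for _, _, c in ITEMS)   # floor of the xi scale
for theta in (-10.0, 0.0, 10.0):
    print(theta, round(xi_score(theta, ITEMS), 3))
print("chance level:", chance_level, "maximum:", len(ITEMS))
```

The score rises from the sum of the guessing parameters toward the
number of items, which is the same structure as the range of 48.7 to 228
quoted above (48.7/228 is about 0.21 per exercise).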

However,  using  the  xi  scale  would  enshrine  the  particular  reading 
assessment  of  Year  15  as  the  standard  for  all  past  and  future  reading 
assessments.    These  exercises  were  selected  from  a  pool  given  by  ECS,  the 
previous  grantee,  to  ETS,  the  present  grantee.    This  set  of  exercises  was 
not  selected  with  any  particular  metric  in  mind;  in  fact,  the  set  as  a 
whole  was  relatively  easy.    The  item  parameters  suggested  unequal  test 
information  at  different  levels  of  the  scale.    Thus,  reporting  results  as 
estimated  scores  on  this  particular  assessment  battery  was  rejected. 

Instead  of  using  the  xi  scale  of  the  actual  assessment  battery,  we 
chose  to  report  the  reading  results  as  the  estimated  true  score  on  a 
hypothetical  reading  proficiency  test  with  somewhat  idealized  properties. 
The  properties  are  as  follows: 

(1)  The  hypothetical  test  consists  of  500  items.    This  property 
has  the  effect  that  test  scores  can  range  between  zero  and 
500. 

(2)  All  item  characteristic  curves  are  logistic,  i.e.  have  the 
general  form 


$$ p_{is} = 1 / (1 + e^{-1.7 a (\theta_s - b_i)}) $$

where $p_{is}$ is the probability that a subject responds correctly
to item i, a is the discrimination parameter, $b_i$ is the
difficulty parameter, and $\theta_s$ is the true proficiency score for
subject s in the theta metric.


¹ LOGIST actually standardizes such that the standard deviation of the
scores between -3 and +3 is unity.




(3)  The  correct  answers  to  items  cannot  be  achieved  by  guessing. 

(4)  All  items  discriminate  equally;  that  is,  a  =  1.5  for  all 
items.    The  value  a  =  1.5  was  chosen  since  it  is  approximately 
the  average  value  of  the  discrimination  parameters  for  the 
actual  items  used  in  NAEP. 

(5) Item difficulties are evenly distributed along the theta
scale; that is, the $b_i$ vary from -4.99 to +4.99 in steps of
.02. Since almost all subjects will typically score in the
range  of  -3.0  to  +3.0,  this  condition  means  that  the 
hypothetical  test  has  about  100  items  so  easy  that  almost 
everyone  responds  correctly  and  about  100  items  so  difficult 
that  they  are  failed  by  almost  everyone. 

Both Lord and Mislevy have shown that a scale defined in this way is
essentially a linear function of the theta scale within the range of actual
data. Holland and Zwick (1986) have provided a general function relating
the theta scale to such hypothetical test scores.² The particular function
used to translate from the theta scale to the reading proficiency (RP)
scale was


$$ RP_s = 250.5 + 50(\theta_s) $$

where $RP_s$ is the score of subject s on the reading proficiency scale.
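
The near-linearity of the hypothetical test score can be checked
numerically. The Python sketch below sums the 500 logistic item
characteristic curves for two grids of difficulties: the intended grid
(-4.99 to +4.99) and the grid from -5.00 to +4.98 that Holland and Zwick
(1986) note corresponds to the values actually used. The scaling
constant D = 1.7 and the variable names are assumptions; the linearity
does not depend on the exact value of D.

```python
import math

A_DISC = 1.5   # common discrimination, property (4)
D = 1.7        # assumed logistic scaling constant


def true_score(theta, difficulties):
    """Expected number correct on the hypothetical test (no guessing)."""
    return sum(1.0 / (1.0 + math.exp(-D * A_DISC * (theta - b)))
               for b in difficulties)


INTENDED_B = [-4.99 + 0.02 * i for i in range(500)]   # -4.99 ... +4.99
AS_USED_B = [-5.00 + 0.02 * i for i in range(500)]    # -5.00 ... +4.98

for theta in (-2.0, 0.0, 1.0, 2.0):
    print(theta,
          round(true_score(theta, INTENDED_B), 2),   # close to 250 + 50*theta
          round(true_score(theta, AS_USED_B), 2))    # close to 250.5 + 50*theta
```

Within the range where most subjects score, the first column of results
stays within a small fraction of a point of 250 + 50 theta and the
second within a small fraction of a point of 250.5 + 50 theta.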

Using  this  definition  of  a  hypothetical  test  and  since  the  distribution 
of  theta  has  a  zero  mean  and  unit  variance,  we  can  make  the  following 
statements  about  the  distribution  of  reading  proficiency  scores: 

(1)  The  mean  reading  proficiency  is  250.5  over  all  ages  and  grades 
combined. 

(2)  The  standard  deviation  of  proficiency  scores  is  50. 

The NAEP RP scores ranged between about 75 and 425. The distribution of
reading proficiency scores in NAEP is not normal, since three distinct ages
and grades are included in the distribution. The overall distribution has
three major modes, one for each grade/age combination, and there is
considerable overlap among the distributions. Thus, the overall mean and
standard deviation over all three grade/age combinations have little
interpretive value.

² The function $RP_s = 250 + 50(\theta_s)$ would have been preferable.
Holland and Zwick (1986) have noted that the values actually used
correspond to the $b_i$ varying from -5.00 to +4.98 in steps of .02 instead of
-4.99 to +4.99 as intended. The result is that the RP scores are a
half-point higher than appropriate for the hypothetical test.

Clearly,  the  NAEP  scale  is  not  norm-referenced  in  the  sense  that 
knowing an individual's score by itself gives any useful information about
how  he  or  she  compares  to  other  individuals.    The  distribution  of  theta  is 
used  only  to  assure  that  the  available  exercises  span  the  range,  and  a 
little  bit  more,  where  we  expect  students  to  be. 

The  hypothetical  test  would  be  appropriately  fit  by  the  Rasch  model 
since  there  is  no  guessing  and  the  item  discrimination  parameters  are  all 
identical. 

The scale of the hypothetical test is equal interval; that is, if we
constructed such a test then a subject who scored five points higher than
another would be expected to answer five more items correctly than the
other no matter where on the scale the two subjects scored. In fact, the
scale is,
in  a  sense,  a  ratio  scale  since  an  estimated  score  of  zero  means  that  the 
subject  would  answer  no  items  correctly;  however,  the  zero  is  arbitrarily 
determined  by  the  specified  range  of  the  difficulty  parameters  and  a  zero 
score  does  not  mean  that  the  subject  has  no  reading  proficiency  at  all. 

Lord  has  noted  that  no  one  would  or  should  build  a  test  according  to 
these  specifications;  having  so  many  easy  and  difficult  items  would  be 
inefficient.    Also,  a  test  for  a  particular  purpose  should  have  its  items 
clustered  near  important  decision  points.    However,  we  are  using  this 
hypothetical  test  for  reporting  purposes  only  and  do  not  intend  to  attempt 
to  construct  a  test  with  such  properties. 


10.5.2    Anchoring  Scale  Points 

In  this  section  we  address  the  issue  of  presenting  what  students  can 
and  cannot  do  in  reading.    Our  approach  is  to  select  a  few  points  on  the 
scale,  find  exercises  that  discriminate  between  what  students  at  each  point 
can  do  that  students  at  lower  levels  cannot,  and  then  attempt  to  generalize 
from  the  exercises  to  classes  of  competency. 

If  the  reading  items  formed  a  perfect  scale  in  the  Guttman  (1941) 
sense,  then  a  person's  test  score  would  indicate  exactly  which  items  that 
person  could  answer  correctly  and  which  he  or  she  could  not.    A  score  of 
275,  say,  would  indicate  that  the  subject  could  answer  the  275th  item  and 
all  easier  items  but  could  not  answer  the  276th  nor  any  more  difficult 
item.    If  two  subjects  have  distinct  scores,  then  the  two  scores  can  be 
used  to  identify  which  items  both  can  answer  correctly,  which  items  the 
higher  scorer  can  answer  and  the  other  cannot,  and  which  items  neither  can 
answer.    At  the  item  level,  a  Guttman  scale  immediately  indicates  what  a 
person  can  and  cannot  do. 
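
The property can be written out in a few lines of Python. The sketch
assumes items are numbered from easiest (1) to hardest, and the function
names are hypothetical.

```python
def guttman_pattern(score, n_items):
    """Response pattern implied by a score on a perfect Guttman scale:
    the `score` easiest items are right, all harder items are wrong."""
    return [1 if rank <= score else 0 for rank in range(1, n_items + 1)]


def compare_scores(low, high, n_items):
    """Items both subjects answer, items only the higher scorer answers,
    and items neither answers, given two distinct scores."""
    both = list(range(1, low + 1))
    only_high = list(range(low + 1, high + 1))
    neither = list(range(high + 1, n_items + 1))
    return both, only_high, neither


print(guttman_pattern(3, 5))
print(compare_scores(4, 7, 10))
```

On such a scale the score alone determines the whole response pattern,
which is what makes the interpretation immediate.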

Of course, the NAEP reading exercises do not form a Guttman scale, and
it is seldom that any real item response data have such ideal properties.
Multiple-choice items are especially unlikely to form a Guttman scale since
the correct answer can be achieved by guessing. Also, subjects did not all
receive the same items, which would complicate the interpretation of the
Guttman scale.

For NAEP, we have searched the reading data for reading exercises which
discriminate strongly between selected points on the scale, although these
exercises are not perfectly discriminating as would be true for a Guttman
scale. We have labeled these selected scale points by attempting to
generalize from the highly discriminating exercises.

The  general  procedure  used  is  as  follows: 

(1)  Choose  the  scale  points  to  anchor.    The  selection  of  the 
anchoring  points  is  important  since  few  items  will  be  found 
that  discriminate  between  close  points  and  little  useful 
information  will  be  found  if  the  points  are  far  apart. 

(2)  Select  items  that  discriminate  between  each  pair  of  adjacent 
points.    The  following  criteria  were  used  for  selecting  items 
at each anchor point:

(a)  eighty  percent  or  more  of  the  students  at  that  point 
could  answer  the  item  correctly. 

(b)  less  than  50  percent  of  the  students  at  the  next 
lower  point  could  answer  the  item  correctly.  This 
criterion  does  not  apply  to  the  lowest  valued  anchor 
point.

Using these criteria, an item can be selected for
discriminating  between  only  one  pair  of  adjacent  points. 

(3)  Batch  the  items  found  to  discriminate  between  pairs  of  anchor 
points. 

(4)  Try  to  generalize  from  each  batch  of  items  to  the  level  of 
accomplishment that the items represent. It is important that
this  step  be  performed  by  experts  in  the  subject  area. 

(5)  Try  to  understand  the  exercises  that  did  not  discriminate 
between any pair of points. Exercises may fail for a number of
reasons  such  as  measurement  of  another  dimension, 
discrimination  between  points  not  chosen  for  anchoring,  or, 
perhaps,  simply  poor  item  construction. 
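
The selection criteria in step (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the item labels, and the proportions correct are all hypothetical, and the proportions could come either from item characteristic curves or from observed data.

```python
def select_anchor_items(p_correct, anchor_points, upper=0.80, lower=0.50):
    """Return {anchor point: [item ids]} for items meeting both criteria.

    p_correct maps an item id to {anchor point: proportion answering correctly}.
    """
    points = sorted(anchor_points)
    selected = {pt: [] for pt in points}
    for item, probs in p_correct.items():
        for i, pt in enumerate(points):
            # (a) at least 80 percent correct at this anchor point
            meets_upper = probs[pt] >= upper
            # (b) under 50 percent correct at the next lower point;
            #     not applied to the lowest anchor point
            meets_lower = i == 0 or probs[points[i - 1]] < lower
            if meets_upper and meets_lower:
                selected[pt].append(item)
    return selected


# Made-up proportions correct for four hypothetical items.
example = {
    "A": {150: 0.85, 200: 0.95, 250: 0.99},
    "B": {150: 0.30, 200: 0.82, 250: 0.97},
    "C": {150: 0.40, 200: 0.60, 250: 0.75},
    "D": {150: 0.20, 200: 0.55, 250: 0.90},
}
anchors = select_anchor_items(example, [150, 200, 250])
```

In this made-up example, item D reaches 90 percent at 250 but is not selected, because 55 percent of students at 200 already answer it correctly; it does not discriminate sharply between the two points.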

The  details  of  the  process  as  implemented  for  the  NAEP  are  as  follows: 

(1)  Anchor  points  were  selected.    For  the  NAEP  reading  scale,  we 
chose to anchor the following points: 150, 200, 250, 300, and
350.    These  points  span  the  range  in  which  most  subjects 
scored. 

(2)  The  probability  of  obtaining  correct  responses  to  each  of  the 
NAEP  reading  proficiency  exercises  at  each  point  was 
estimated.    This  step  was  done  using  the  parameters  of  the 
item  characteristic  curves  which  were  available  from  the 
LOGIST  (Wingersky,  Barton,  &  Lord,  1982)  and  BILOG  (Mislevy  & 
Bock,  1982)  programs. 

(3)  The RP point at which the probability of passing was .80 was
computed. This was computed using the item parameters and
solving the equation of the three-parameter logistic model for
the value of theta for which p = .80. The theta value for
each item was then transformed to the RP scale. These values
are called RP80.

Steps  2  and  3  were  computed  by  an  IBM-PC  program  called  Behanc 
(Beaton,  1986). 

(4)  The  items  were  sorted  by  RP80.    This  was  done  to  place  the 
items  in  an  order  of  difficulty.    The  actual  texts  of  the 
items  were  cut  and  pasted  onto  sheets  of  paper  which  were 
placed  in  a  binder  in  RP80  order. 

(5)  The  item  statistics,  including  whether  or  not  they  met  the 
anchoring  criterion,  were  pasted  into  the  item  text  book 
underneath  the  item  text.    Items  meeting  the  discrimination 
criteria  were  highlighted. 

(6)  Red  markers  were  placed  in  the  item  text  book  to  pinpoint  the 
item  with  RP80  value  closest  to  each  anchor  point  and  blue 
markers  were  entered  to  mark  the  mid-points  between  anchor 
points (e.g., 175, 225, etc.).

(7)  The  item  text  books  were  delivered  to  reading  consultants  who 
were  asked  to  interpret  the  items.    The  meaning  of  the  item 
statistics was described. The panel of subject matter
specialists was asked to look at the batches of items and
describe  what  students  at  each  level  could  do  that  students  at 
lower  levels  could  not.    We  asked  for  a  description  in  a 
paragraph  or  two,  then  a  summary  sentence,  and,  finally,  a  one 
word  label  for  the  point.    They  were  also  asked  to  select 
several  items,  which  met  the  criteria,  to  serve  as  exemplars 
for  each  anchor  point. 

(8)  In  developing  the  descriptions,  the  reading  consultants  used 
their  expert  judgment  as  well  as  descriptive  statistics  of  the 
passage  and  item  types  to  characterize  the  relationship 
between  the  type  of  question  asked  and  the  text 
characteristics. 




(9)  The  anchor  point  descriptions,  along  with  the  item  examples, 
were then reviewed by additional reading specialists. The
descriptions,  sentences,  and  labels  were  revised  incorporating 
suggestions  of  these  reading  specialists. 

(10)  Not  all  items  failing  to  meet  the  discrimination  criteria  were 
studied  to  find  out  exactly  why;  these  were  given  less 
priority in selecting items for the 1985-86 assessment.
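
Steps (2) and (3) of the list above can be sketched as follows. This is not the Behanc program; it is a minimal illustration that assumes the usual three-parameter logistic form with the scaling constant 1.7, and the item parameters and the linear transformation constants shown are hypothetical placeholders, not the values used for the NAEP reading scale.

```python
import math

D = 1.7  # conventional logistic scaling constant (an assumption here)


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def theta_for_probability(p, a, b, c):
    """Invert the 3PL curve: the theta at which P(correct) equals p."""
    if not c < p < 1.0:
        raise ValueError("p must lie strictly between c and 1")
    return b + math.log((p - c) / (1.0 - p)) / (D * a)


def to_reporting_scale(theta, slope, intercept):
    """Linear transformation from the theta metric to a reporting scale."""
    return intercept + slope * theta


# Hypothetical item: discrimination a, difficulty b, lower asymptote c.
a, b, c = 1.0, 0.0, 0.2
theta80 = theta_for_probability(0.80, a, b, c)
# Placeholder constants, used only to show the form of the transformation.
rp80 = to_reporting_scale(theta80, slope=50.0, intercept=250.0)
```

Because the lower asymptote c enters the inversion, two items with the same difficulty b but different guessing parameters will generally have different RP80 values.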


The results of the anchoring process were published in The Reading
Report Card: Progress Toward Excellence in Our Schools (1985, p. 15). The
description of the anchor points is reprinted here as Figure 10.5-1. The
five levels of proficiency were defined as Rudimentary (150), Basic (200),
Intermediate (250), Adept (300), and Advanced (350). The Reading Report
Card also includes at least two sample items for each anchor point.

The  anchoring  process  allows  the  description  of  what  a  student  can  and 
cannot  do  in  terms  of  levels  of  reading  performance,  not  in  terms  of  what 
other  students  do  or  do  not  do.  We  can  estimate  directly  the  number  or 
proportion  of  students  who  can  perform  at  different  levels,  which  is  the 
sort of information needed for policy action. For example, such statements
as  64.2  percent  of  the  9-year-olds  in  1983-84  could  read  at  the  Basic  level 
but  only  17  percent  could  read  at  the  Intermediate  level  are  possible.  The 
individual  differences  among  students  can  be  described  without  introducing 
the  concepts  of  variance  and  standard  deviation.  Several  tables  containing 
levels  of  proficiency  for  NAEP  students  are  shown  in  Part  III. 


★  ★  ★ 


In  his  early  work  on  the  measurement  of  learning  outcomes,  Glaser 
(1963)  wrote: 

...a  student's  score  on  a  criterion-referenced 
measure  provides  explicit  information  as  to  what  the 
individual  can  or  cannot  do.  Criterion-referenced 
measures  indicate  the  content  of  the  behavioral 
repertory,  and  the  correspondence  between  what  an 
individual  does  and  the  underlying  continuum  of 
achievement. (p. 519)

In  this  sense,  the  NAEP  reading  exercises  have  been  used  to  create  a 
criterion-referenced  test.  For  the  reading  scale,  we  present  not  only 
sample exercises to show what students can do but also generalize to
classes  of  behaviors. 


This way of presenting results does not contain all of the information
that presenting the percent passing for each item does but puts together a
large amount of information in a simple way. The percents passing are
still available for interested researchers as long as they do not reveal
the exercises to the public.


Figure 10.5-1
Levels of Proficiency

Rudimentary (150)

Readers who have acquired rudimentary reading skills and strategies can
follow brief written directions. They can also select words, phrases, or
sentences to describe a simple picture and can interpret simple written
clues to identify a common object. Performance at this level suggests the
ability to carry out simple, discrete reading tasks.

Basic (200)

Readers who have learned basic comprehension skills and strategies can
locate and identify facts from simple informational paragraphs, stories,
and news articles. In addition, they can combine ideas and make inferences
based on short, uncomplicated passages. Performance at this level suggests
the ability to understand specific or sequentially related information.

Intermediate (250)

Readers with the ability to use intermediate skills and strategies can
search for, locate, and organize the information they find in relatively
lengthy passages and can recognize paraphrases of what they have read.
They can also make inferences and reach generalizations about main ideas
and author's purpose from passages dealing with literature, science, and
social studies. Performance at this level suggests the ability to search
for specific information, interrelate ideas, and make generalizations.

Adept (300)

Readers with adept reading comprehension skills and strategies can
understand complicated literary and informational passages, including
material about topics they study at school. They can also analyze and
integrate less familiar material and provide reactions to and explanations
of the text as a whole. Performance at this level suggests the ability to
find, understand, summarize, and explain relatively complicated
information.

Advanced (350)

Readers who use advanced reading skills and strategies can extend and
restructure the ideas presented in specialized and complex texts. Examples
include scientific materials, literary essays, historical documents, and
materials similar to those found in professional and technical working
environments. They are also able to understand the links between ideas
even when those links are not explicitly stated and to make appropriate
generalizations even when the texts lack clear introductions or
explanations. Performance at this level suggests the ability to synthesize
and learn from specialized reading materials.

The  anchoring  of  scales  is  not  necessarily  derivative  from  the  IRT 
process, although the IRT parameters were used for NAEP. In fact, it is
possible  to  attempt  anchoring  using  any  scale  scores  that  are  assigned  to 
the  students.    Whether  or  not  an  item  meets  the  criteria  could  be 
established  directly  by  computing  the  percent  correct  at  or  above  each 
scale  point  for  each  item.    Items  that  met  the  criteria,  if  any,  could  be 
subjected  to  interpretation.    Using  the  IRT  parameter  estimates  is  actually 
more  conservative  than  necessary  since  we  used  the  theoretical  points  at 
which  50  percent  and  80  percent  passed  the  item  for  evaluating  an  item's 
discrimination.  If  the  IRT  model  holds,  far  more  than  80  percent  at  much 
higher  scores  would  pass  the  item  and  fewer  than  50  percent  of  those 
scoring  below  the  next  lower  level  would  pass  the  item.  Scale  anchoring  is, 
therefore,  applicable  to  any  approximately  unidimensional  examination  with 
highly  discriminating  items. 

We  should  also  note  that  there  are  other  ways  of  anchoring  the  scale. 
For  NAEP,  we  started  with  scale  points  and  searched  for  general 
descriptions,  words,  and  items  to  describe  the  scale  points.    This  could  be 
described  as  an  extension  of  the  suggestion  of  Bock,  Mislevy,  and  Woodson 
(1982)  to  label  points  on  a  scale  by  items  which  80  percent  of  the  students 
at  particular  scale  points  could  do.  Another  approach  would  be  to  start 
with  behavioral  descriptions  and  then  look  for  the  point  on  the  scale  which 
corresponded to the description; for example, we might select several
exercises  to  represent  a  level  of  proficiency  and  then  use  some  average 
measure  of  item  difficulty  as  the  point  on  the  scale  representing  that 
proficiency  level. 

We  also  note  that  the  percent  above  a  scale  point  is  not  affected  by 
monotone  transformations  of  the  theta  scale.    Once  the  anchor  points  are 
selected,  any  monotone  transformation  of  the  scale  accompanied  by  the 
corresponding  transformation  of  the  anchor  points  will  not  affect  the 
percentages  at  or  above  particular  points  (see  Goldstein,  1980  for  a  clear 
description  of  the  problem). 
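
The invariance just described can be checked with a short sketch. The scores and the cut point below are made up for illustration, and the logarithm stands in for any strictly increasing transformation.

```python
import math


def percent_at_or_above(scores, cut):
    """Percentage of scores that are at or above the cut point."""
    return 100.0 * sum(s >= cut for s in scores) / len(scores)


# Made-up scale scores and a made-up anchor point.
scores = [142.0, 187.5, 203.0, 251.0, 260.5, 298.0, 305.0, 351.0]
cut = 250.0


def transform(x):
    """Any strictly increasing function will do; log is one example."""
    return math.log(x)


before = percent_at_or_above(scores, cut)
after = percent_at_or_above([transform(s) for s in scores], transform(cut))
```

Both calls give the same percentage because a strictly increasing transformation preserves the ordering of every score relative to the anchor point. Means and standard deviations, by contrast, are not preserved in general.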




Chapter  11 

THE WRITING DATA ANALYSIS: INTRODUCTION

Albert  E.  Beaton 
Educational  Testing  Service 


The NAEP has completed two reports on writing: Writing Trends Across
the Decade, 1974-84 (Applebee, Langer, & Mullis, 1986a) and The Writing
Report Card: Writing Achievement in American Schools, 1984 (Applebee,
Langer, & Mullis, 1986b). The purpose of this chapter of the technical
report is to provide the information necessary to understand the properties
of the writing data, which are available on the public-use data tapes, and
to understand the analyses underlying these two reports.

As  mentioned  in  Chapter  1,  ETS  did  not  propose  to  include  the  writing 
assessment  in  the  spiralled  part  of  the  sample  nor  did  it  intend  to  scale 
the  writing  data,  but  did  indeed  do  both.  The  original  conception  was  to 
present  the  trend  data  exercise  by  exercise  using  the  tape-administered 
assessment  results  or,  if  the  exercises  administered  by  print  could  be 
reasonably  equated  with  those  administered  by  tape  recorder,  present  a 
combination  of  the  two  data  sets.  The  exercise-by-exercise  approach  was 
taken in the analysis for Writing Trends Across the Decade, 1974-84.

Although  seemingly  simple  at  first,  the  exercise-by-exercise  approach 
leads  to  complications  in  interpretation.  With  the  NAEP  data,  a  comparison 
of  17-year-olds  of  1983-84  with  their  peers  of  1978-79  would  have  to  be 
based  on  different  exercises  than  a  comparison  of  the  17-year-olds  with  the 
13-year-olds  of  1983-84.  Within  the  exercises  used  in  a  comparison, 
different  exercise  averages  might  move  in  the  same  direction,  but  at 
different  rates.  Different  exercises  have  different  averages;  the  reader 
who  does  not  remember,  for  example,  that  the  13-year-olds  had  an  easier 
exercise  than  the  17-year-olds  may  make  false  generalizations  from  the 
data. 

We  believed  that  a  common  scale  onto  which  all  writing  exercises  could 
be  projected  would  help  in  the  interpretation  of  the  data.  We  strove  to 
develop  an  overall  measure  of  writing  proficiency  that  would  be  comparable 
over  ages  and  times. 

Developing  the  writing  scale  incorporates  much  of  the  individual 
writing  exercise  information.    Information  at  the  exercise  level  is  in  The 
Writing  Report  Card  and  is  available  on  the  public-use  data  tapes  for 
anyone  interested  in  further  research. 




The development of the writing scale was not simple. First, we found
that changing from tape-recorded to printed administration affected the
responses in ways that precluded equating the results; thus, the trend
analyses were based only on data collected by tape-recorded administration.
We attempted to apply two IRT models for non-binary data that were proposed
by Masters (1982), but these models did not provide acceptable results for
the NAEP data. Finally, we developed the Average Response Method (ARM)
(described below) and applied it to the cross-sectional analyses.


*  *  * 


Before describing the data analyses, it is useful to review the
background of the writing assessment data. In the 1983-84 writing
assessment, NAEP used 22 different writing exercises. Exercises were
designed to assess three different types of skills: informational writing,
persuasive writing, and imaginative writing. The process by which these
exercises were developed is described in Chapter 3.

NAEP has conducted four assessments of writing. Writing was first
assessed in assessment Year 1 (1969-70), then in Year 5 (1973-74), and Year
10 (1978-79), and finally in Year 15 (1983-84). The NAEP writing exercises
were supplied to the Educational Testing Service by the Education
Commission of the States, which had administered the previous three writing
assessments. ETS selected the 22 exercises that were used in Year 15.
Some of these exercises had been used in previous writing assessments and
others had not. No Year 1 exercises were used, because all of those
exercises had been previously released.

The scoring of these exercises is discussed in Chapter 8.2. All
exercises were scored using the primary trait method and a few were also
scored using the holistic method. Some were also scored on secondary
traits. All scores for the Year 15 assessment, including rescores for
reliability analysis, are available on the public-use data tapes. The
actual assessment papers were recovered for those students who had been
assessed in NAEP Years 5 and 10 and had been administered essays that were
also administered in Year 15. These papers were then scored along with the
essays written for the 1983-84 assessment.

The changes in the design of NAEP by ETS have had an important effect
on the data collected. In the earlier writing assessments, the writing
exercises were administered using a tape recorder so that students were
instructed about tasks aurally. The purpose of the tape recorder was to
reduce dependence of subject area assessments on a student's ability to
read an exercise and its instructions. The ETS design of NAEP led to
administering different exercises, perhaps in different subject areas, at
the same assessment session; thus, aural administration was not feasible.
To avoid losing comparability between the Year 15 and prior writing
assessments, two distinct types of assessments were performed, one using
pencil-and-paper administration and the other using tape recorders. All of
the  22  writing  exercises  were  administered  using  pencil-and-paper  methods; 
some  of  these  were  also  administered  using  the  tape  recorders. 

The  pencil-and-paper  assessment  booklets  were  constructed  using  BIB 
spiralling  where  possible.    The  BIB  spiralling  generated  57  different 
assessment  booklets  at  each  grade/age  level.    The  BIB-spiralled  section  of 
the  pencil-and-paper  assessment  assures  that  each  pair  of  exercises  occurs 
jointly in some booklet and will be administered to an equivalent sample of
students.  (BIB  spiralling  is  described  in  Chapter  4.)    However,  some 
reading  and  writing  exercises  took  more  than  fourteen  minutes  to  complete 
and  thus  could  not  fit  into  the  fourteen-minute  BIB  block  structure.  To 
accommodate  these  exercises,  four  additional  assessment  booklets  were 
developed  at  each  grade/age  level.    These  booklets,  which  are  called  the 
unbalanced incomplete block (UBIB) booklets, were spiralled into the
pencil-and-paper  sample  and  thus  administered  to  a  sample  of  students 
equivalent  to  the  sample  that  received  the  BIB  booklets.    The  UBIB  booklets 
lose  the  property  of  having  each  exercise  paired  with  each  other  exercise; 
in  fact,  few  correlations  are  computable  between  exercises  in  different 
UBIB  booklets. 
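
The pairing property of a balanced incomplete block design can be illustrated with a small sketch. It assumes, purely for illustration, 19 blocks with three blocks per booklet (neither figure is stated in this section; the actual design is described in Chapter 4), and it builds the booklets by cycling three hypothetical base triples modulo 19, a standard textbook construction that happens to yield 57 booklets. It does not model the balancing of block positions within booklets.

```python
from collections import Counter
from itertools import combinations

V = 19  # assumed number of exercise blocks (illustrative only)
# Base triples whose pairwise differences cover 1..18 (mod 19) exactly once.
BASE_TRIPLES = [(0, 1, 6), (0, 2, 10), (0, 3, 7)]

# Develop each base triple through all 19 cyclic shifts.
booklets = [
    tuple(sorted((block + shift) % V for block in base))
    for base in BASE_TRIPLES
    for shift in range(V)
]

# Count how often each pair of blocks shares a booklet.
pair_counts = Counter(
    pair for booklet in booklets for pair in combinations(booklet, 2)
)

# Count how many booklets each block appears in.
block_counts = Counter(block for booklet in booklets for block in booklet)
```

In this construction every pair of blocks appears together in exactly one booklet, which is what makes every between-block correlation estimable; a booklet added outside the design, as with the UBIB booklets, does not preserve that property.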

Another  important  detail  in  understanding  the  writing  data  and  their 
analysis  is  that  the  sample  administered  by  tape  recorder  is  collected  only 
by age; the BIB and UBIB samples are collected by both age and grade. NAEP
had  sampled  only  by  age  in  the  past;  thus,  the  part  of  the  Year  15 
assessment  that  was  to  be  directly  compared  to  past  assessments  was  sampled 
in  the  same  way.  The  BIB  and  UBIB  samples  contain  students  who  are  either 
9-years-old or in the fourth grade, either 13-years-old or in the eighth
grade,  and  either  17-years-old  or  in  the  eleventh  grade.  Since  only  age- 
eligible  students  were  assessed  using  tape  recorders,  only  age-eligible 
students  were  used  to  compare  methods  of  administration. 

Table  11(1)  summarizes  the  properties  of  the  writing  data.    For  each 
exercise,  the  table  provides: 

*  the  exercise  identification  number  and  a  short  description  of 
the  exercise; 

*  the  type  of  writing  task; 

*  the  assessment  years  in  which  the  exercise  was  administered; 

*  an  indicator  of  whether  the  exercise  was  in  the  BIB  spiral, 
UBIB  spiral,  or  paced  tape  samples; 

*  the  identification  number  of  the  holistic  score,  if  the 
exercise  was  scored  holistically  as  well  as  by  primary  trait; 
and 

*  an  indicator  as  to  whether  the  exercise  was  used  in  computing 
ARM  scale  values.  (The  ARM  scale  is  described  in  Chapter 
11.4.) 




Table  11(1) 


Year  15  NAEP  Writing  Exercises 


Exercise                        Tasks

N000102  DALI^                    1
N000202  SCHOOL RULE^             2
N000302  RECREATION OPP.          2
N000402  FOOD ON FRONTIER^        1
N000502  DISSECTING FROGS         2
N000602  XYZ COMPANY^             1
N000702  SWIMMING POOL            2
N000802  PETS^                    1
N000902  RADIO STATION            2
N001002  APPLEBY HOUSE            1
N007202  HOLE IN THE BOX          3
N007602  FLASHLIGHT               3
N007702  GHOST STORY              3
N007902  FAVORITE MUSIC           1
N008002  SPLIT SESSION            2
N014702  PLANTS                   1
N014802  SPACESHIP                2
N014902  AUNT MAY                 2
N018002  SPACE PROGRAM            2
N019002  JOB APPLICATION          1
N020002  UNCLE                    2
N021002  BIKE LANE                2


Age  9  ^ 
Assessment  Year 
5         10  15 

T*  B,T 
B 

B 

B 
B 

T  B 
B 
B 


T  T 


U,T 
U 
U 
U 

B 
B 

U,T 


Age  13 
Assessment  Year 
5         10  15 

T  B,T 
B 
B 
B 
B 

B 
B 

T  B 
B 
B 


T  T 


U,T 

U 

U 

U

U,T 


Age  17 
Assessment  Year 
5         10  15 

T  B 
B 
B 
B 


T  T 


B 
B 

B 
B 


N000108 


B 
B 

U,T  N007208 

U 

U 

U 

U,T  N008008 


N014909 


^Year 5=1973-74, Year 10=1978-79, Year 15=1983-84

^Types of writing tasks: 1=informative, 2=persuasive, 3=imaginative
^Included in the ARM

*T=administered by tape recorder, age data only; B=administered in BIB spiral blocks, age and grade
data; U=administered in other blocks, age and grade data




Much more information about the exercises is available in Chapter 3.
The  actual  exercise  text  is  available  on  the  microfiche  accompanying  the 
public-use  data  tapes. 

In  addition  to  the  responses  to  the  writing  exercises,  analyses  for  the 
writing  reports  also  include  a  number  of  specific  questions  about  students' 
attitudes toward writing and their writing practices. A brief discussion of
these items is included in Chapter 6; they are discussed more fully in the
reports  in  which  they  are  used. 

The  next  four  chapters  of  this  technical  report  are  summarized  below. 
11.1    The  Writing  Exercise  Data 

This  section  contains,  among  other  things,  the  average 
values  and  standard  deviations  of  the  writing  exercises  and 
the inter-rater reliability coefficients.


11.2    The  Effect  of  Mode  of  Exercise  Administration  (BIB  Spiral  or 
Paced  Tape)  on  Estimates  of  Writing  Performance 

This  section  shows  the  differences  in  responses  between  the 
sample  administered  by  pencil-and-paper  and  that 
administered  by  tape-recorded  procedures.  The  comparison 
shows  better  average  performance  at  all  three  age  levels 
when  the  exercises  are  administered  by  tape  recorder.  The 
benefit  attributable  to  tape-recorded  administration  appears 
to vary both by demographic subgroup and writing exercise.
As  a  result,  it  was  decided  not  to  attempt  to  merge  the  data 
collected  by  the  two  methods. 


11.3    Estimation  of  Trends  in  Writing  Achievement 

Because  the  amount  of  trend  data  was  insufficient  to  support 
scaling,  and  because  the  pencil-and-paper  data  could  not  be 
merged  with  the  data  collected  at  tape  recorder  sessions, 
the  analysis  of  trend  data  was  based  only  on  individual 
essays  that  were  administered  in  different  assessment  years 
and  which  were  also  administered  by  tape  recorder  in  Year 
15.  The  statistical  considerations  in  the  trend  analysis  are 
discussed  in  this  section. 


11.4    The  Average  Response  Method  (ARM)  of  Scaling 

Some  of  the  writing  data  were  scaled  using  the  average 
response  method.  The  underlying  assumptions  and  derivations 
as  well  as  the  computation  of  plausible  values  are  presented 
in  this  section.  The  potential  bias  due  to  model 
misspecification is also discussed and an alternative
method  is  given  which  is  unbiased,  but  not  as  general  in 
application.  The  two  statistical  procedures  are  compared 
using  the  NAEP  writing  data  and  the  results  are  presented. 
It  is  our  opinion  that  the  ARM  is  a  useful  tool  for 
estimation  and  interpretation  and  is  a  promising  tool  for 
future  data  analytic  work. 




Chapter  11.1 
THE  WRITING  EXERCISE  DATA 


Albert  E.  Beaton 
Educational  Testing  Service 


The  purpose  of  this  section  is  to  provide  some  basic  information  about 
the  writing  data.  All  data  were  rated  by  professional  judges  and  the 
details of the scoring process are given in Chapter 8.2. The same scoring
protocols  were  used  for  all  three  grade/age  combinations  and  were  applied 
to  the  data  from  past  assessments  as  well.  Included  here  is  information 
about: 

*  the  rater  reliability; 

*  the  scale  drift  during  the  rating  process;  and 

*  the  basic  descriptive  statistics  for  each  exercise. 

Other  information  about  the  writing  data  can  be  found  in  Chapters  11.2, 
11.3,  and  11.4. 


11.1.1    Inter-Rater Reliability

Since  the  individual  essays  were  rated  by  professional  judges,  the 
question  of  the  consistency  of  judges  must  be  addressed.  To  do  so,  we 
performed an analysis of the inter-rater reliability. A 20 percent sample
of  the  essays  was  selected  and  independently  rated  by  a  second  scorer. 
These  multiply-rated  essays  form  the  basis  of  the  inter-rater  reliability 
analysis,  the  results  of  which  are  shown  in  Table  11.1(1). 

Two  statistics  were  chosen  for  presentation:    the  percent  of  exact 
agreement  and  the  reliability  coefficient.    The  percent  of  exact  agreement 
is  the  percentage  of  times  that  the  two  scorers  agreed  exactly  in  their 
ratings.    The  reliability  coefficient  is  the  intra-class  correlation  among 
raters. 
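
Both statistics can be sketched for the two-rater case. The form of the intra-class correlation used in the report is not specified in this section, so the sketch below assumes the one-way random-effects version; the ratings are made up, and the function names are hypothetical.

```python
def percent_exact_agreement(pairs):
    """Percentage of essays on which the two scorers gave the same rating."""
    return 100.0 * sum(first == second for first, second in pairs) / len(pairs)


def intraclass_correlation(pairs):
    """One-way random-effects intra-class correlation, two ratings per essay."""
    n = len(pairs)
    k = 2  # ratings per essay
    grand_mean = sum(first + second for first, second in pairs) / (n * k)
    ss_between = 0.0
    ss_within = 0.0
    for first, second in pairs:
        essay_mean = (first + second) / k
        ss_between += k * (essay_mean - grand_mean) ** 2
        ss_within += (first - essay_mean) ** 2 + (second - essay_mean) ** 2
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up primary trait ratings (0 to 4) from an original and a second scorer.
ratings = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 3), (2, 2), (1, 2), (3, 3)]
agreement = percent_exact_agreement(ratings)
reliability = intraclass_correlation(ratings)
```

The two statistics answer different questions: exact agreement ignores how far apart two disagreeing ratings are, whereas the intra-class correlation also depends on how much the essays themselves vary in quality.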

The  results  for  both  primary  trait  and  holistic  scorings  are  shown  in 
Table  11.1(1).  For  each  grade/age  combination,  the  number  of  responses 
analyzed is shown. The next column is the percentage of times the two scorers
agreed exactly in their ratings. The third column is the reliability
coefficient. 




Table  11.1(1) 


Percentages  of  Exact  Score  Point  Agreement  and  Intra-class  Correlation  Coefficients 
for  Primary  Trait  Scoring,  Year  15  (Possible  Score  Range:    0  to  4) 


Writing  Tasks 

Informative  Writing 
Pets 

Job  Application 
Plants 

Appleby  House 
XYZ  Company 
Dali 

Favorite  Music 

Food  on  the  Frontier 

Persuasive  Writing 
School  Rule 
Dissecting  Frogs 
Swimming  Pool 
Split  Sessions 
Space  Ship 
Space  Program 
Recreation  Opportunity 
Radio  Station 
Aunt  May 
Uncle 
Bike  Lane 

Imaginative  Writing 
Hole  in  the  Box 
Flashlight 
Ghost  Story 


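Grade 4

Writing Task                N    Agreement   Coefficient

Informative Writing
Pets                       534     92.3         .88
Job Application              0      -            -
Plants                     402     92.1         .93
Appleby House              635     89.6         .92
XYZ Company                506     93.1         .92
Dali                       396     90.9         .88
Favorite Music             434     93.4         .89
Food on the Frontier       440     92.5         .89

Persuasive Writing
School Rule                479     91.6         .88
Dissecting Frogs             0      -            -
Swimming Pool              535     90.8         .89
Split Sessions               0      -            -
Space Ship                 506     88.1         .90
Space Program                0      -            -
Recreation Opportunity       0      -            -
Radio Station              639     95.7         .97
Aunt May                   434     91.6         .92
Uncle                        0      -            -
Bike Lane                    0      -            -

Imaginative Writing
Hole in the Box            424     91.5         .89
Flashlight                 445     92.9         .91
Ghost Story                435     93.3         .89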


  Grade  8   

N       Agreement    Coefcnt . 


524  84.4  .78 

0  - 

0  - 

719  79.0  .84 
466  89.9  .86 
468  82.0  .81 
528  84.4  .67 
460  82.2  .76 

4;-!3  81.4  .70 

46C  78.0  .71 

523  83.9  .82 

432  84.4  .80 

0  - 

0  - 

452  86.4  .87 

720  84.2  .88 


461  82.6  .86 
436  80.9  .79 
528        83.1  .85 


  Grade  11  

N     Agreement  Coefcnt. 


0  - 

497  91.1  .92 

0  - 

715  89.4  .92 

0  - 

449  91.3  .92 

499  95.0  .90 

487  92.6  .90 


527  92.5  .91 

523  90.9  .91 

461  88.4  .88 

495  90.2  .92 

478  89.9  .92 


523  89.3  .90 

720  88.5  .91 


504  91.1  .92 

463  92.3  .91 

498  91.1  .93 




These  results  show  a  very  high  degree  of  agreement  between  the  raters. 
Table  11.1(2)  summarizes  the  statistics  by  grade. 

For  Grades  4  and  11,  no  exercise  had  less  than  88  percent  exact 
agreement;  some  exercises  had  agreement  over  95  percent.  The  reliability 
coefficients  are  also  high,  ranging  from  .88  to  .97. 

The  reliability  for  Grade  8  is  quite  acceptable,  but  not  as  high. 
Percents  of  exact  agreement  range  from  78.0  to  89.9  and  reliability 
coefficients from .67 to .88. The lower values for Grade 8 may be related
to the fact that, because the eighth graders were assessed first, in the
fall of 1983, the scorers were less experienced when these papers were
rated.

Table  11.1(3)  shows  the  percents  of  exact  agreement  and  reliability 
coefficients  for  the  exercises  that  were  used  in  the  trend  report.  These 
essays  were  scored  for  the  first  time  to  estimate  trends  and  are, 
necessarily,  reported  by  ages,  not  grades,  since  past  data  were  collected 
only  by  age.  These  results  are  also  quite  good. 


11.1.2    Batching  Effect 

As  mentioned  above,  the  writing  samples  were  rated  as  they  were 
received  by  the  scorers  with  the  result  that  the  eighth  graders  were  rated 
first,  the  fourth  graders  next,  and  the  eleventh  graders  last.  Since  there 
were  so  many  essays  to  score,  waiting  until  all  writing  samples  were 
collected and then intermingling them, so that all grades would be rated at
the  same  time,  would  have  caused  serious  delays  in  reporting  the  writing 
results. 

The  rating  of  the  different  grades  separately  and  serially  led  to  a 
concern  about  a  drift  in  the  rating  scale  throughout  the  rating  process.  To 
examine  the  size  of  the  drift,  if  one  existed,  an  experiment  on  the  effect 
of  batching  was  performed. 

The  experiment  was  designed  and  carried  out  by  Zwick.  In  summary,  three 
essays  which  were  administered  in  all  three  grades  were  selected.  These 
three  essays  were  contained  in  one  booklet.  Half  of  the  booklets  were 
retrieved,  with  resulting  sample  sizes  of  156  cases  for  Grade  4/Age  9,  174 
cases  for  Grade  8/Age  13,  and  173  cases  for  Grade  11/Age  17. 

These  booklets  were  randomly  permuted  and  then  blindly  re-rated;  that 
is,  the  re-raters  were  given  neither  the  age  or  grade  of  a  respondent  nor 
the  previous  rating  of  an  exercise.  The  re-raters  were  selected  from  the 
pool  of  original  raters.  After  the  re-rating,  the  original  rating  and  the 
re-rating  were  compared  using  a  three-way  (Grade/Age  x  Exercises  x  Time) 
repeated  measures  analysis  of  variance.  It  was  decided  before  the  analysis 
that  a  batch  effect  of  less  than  a  tenth  of  a  score  point  was  ignorable. 
The  estimated  batch  effects  were  .01  for  Grade  4/Age  9,  -.04  for  Grade 
8/Age  13,  and  .03  for  Grade  11/Age  17.    These  batch  effects  were  not 
statistically  significant  at  the  .05  level. 
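
The decision rule described above can be sketched as a simple check of the reported effects against the threshold fixed in advance. The paired mean shift below is only a simplified summary; it is not the three-way repeated measures analysis of variance actually used, and the sign convention (re-rating minus original rating) and the function names are assumptions made for illustration.

```python
IGNORABLE_SHIFT = 0.10  # score points; threshold set before the analysis

# Estimated batch effects as reported in the text.
reported_effects = {
    "Grade 4/Age 9": 0.01,
    "Grade 8/Age 13": -0.04,
    "Grade 11/Age 17": 0.03,
}

def mean_rating_shift(original, rerated):
    """Average of (re-rating minus original rating) across papers (assumed sign)."""
    return sum(r - o for o, r in zip(original, rerated)) / len(original)

def is_ignorable(effect, threshold=IGNORABLE_SHIFT):
    """True when the absolute effect is below the pre-set threshold."""
    return abs(effect) < threshold

ignorable = {sample: is_ignorable(e) for sample, e in reported_effects.items()}
```

All three reported effects fall well inside the tenth-of-a-point band, which is consistent with the decision to use the original scorings unadjusted.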




Table 11.1(2)
Reliability Statistics for Primary Trait Ratings

          Number of       Low        High
Grade     Exercises*    Percent    Percent    Low r    High r

  4          15           88.1       95.7      .88      .97
  8          15           78.0       89.9      .67      .88
 11          15           88.4       95.0      .88      .93

*Although there were 22 writing exercises over all grades, only 15
were administered at each grade.




Table 11.1(3)

Percentages of Exact Score Point Agreement and Intra-class Correlation Coefficients
for Primary Trait Scoring Conducted in 1983-84

                          1974 Papers                   1979 Papers                   1984 Papers
                      N  Agreement  Coefficient     N  Agreement  Coefficient     N  Agreement  Coefficient

Age 9
  Hole in the Box   501     92%        .90        497     93%        .89        289     90%        .86
  Dali                0      -          -         509     88         .83        283     90         .83
  Aunt May            0      -          -         512     88         .89        283     92         .95

Age 13
  Hole in the Box   505     85         .82        563     85         .83        282     78         .79
  Dali                0      -          -         535     90         .86        274     78         .73
  Split Sessions      0      -          -         574     90         .84        275     87         .79

Age 17
  Hole in the Box   459     90         .90        547     89         .89        332     92         .91
  Dali                0      -          -         501     90         .85        337     90         .89
  Split Sessions      0      -          -         555     91         .89        335     89         .91



More detail about, and other analyses of, the data collected for this
experiment are provided in a supplementary paper by Zwick (1986b).


As a result of this experiment, it was decided to use the original
scorings without any adjustment for batching.

11.1.3    Descriptive  Statistics 

Table 11.1(4) contains the number of students who responded to each
writing exercise as well as the mean and standard deviation of the ratings.
These statistics are presented for the three grade samples only. The
sampling weights were used in calculating the means and standard
deviations.




Table 11.1(4)

Number of Students Responding to Each Writing Exercise with Rating Mean
and Standard Deviation (Possible Score Range: 0 to 4)

                   Grade 4                      Grade 8                      Grade 11
Variable       N     Mean  Std. Dev.        N     Mean  Std. Dev.        N     Mean  Std. Dev.

N000102     1810   1.3610   [illeg.]     1970   1.8949     0.7267     2268   2.1632     0.7768
N000202     2018   1.6363     0.6029     2253   1.9808     0.5926     2370   2.1320     0.6442
N000302        0        -          -     2234   1.6082     0.7751     2357   1.9565     0.8116
N000402     1844   1.3751     0.6095     2236   2.0015     0.6680     2373   2.1064     0.6974
N000502        0        -          -     2339   2.0276     0.6260        0        -          -
N000602     1770   1.7714     0.9676     2229   2.5226     0.7474        0        -          -
N000702     2027   1.5156     0.6911     2341   1.7850     0.6864     2400   1.9511     0.7274
N000802     1698   1.7073     0.5960     2190   2.153?     0.7259        0        -          -
N000902     2066   1.5686     0.8433     2305   2.0762     0.8871        0        -          -
N001002     1497   1.8914     0.8248     2040   2.4325     0.8015     2050   2.4953     0.8320
N007202     2139   1.3386   [illeg.]     2294   1.7718     0.8994     2469   1.3035     0.8?63
N007602     2018   1.7443   [illeg.]     2286   2.2248     0.6998     2362   2.2833     0.7081
N007702     2119   1.8467   [illeg.]     2336 [illeg.]     0.7986     2429   2.3175     0.9324
N007902     1564   1.5294     0.5760     1990   1.8863     0.5335     2127   1.8761     0.5272
N008002        0        -          -     2330   1.4060     0.7146     2376   1.7082     0.7948
N014702     2029   2.2458     0.7478        0        -          -        0        -          -
N014802     2026   1.8456     0.8709        0        -          -        0        -          -
N014902     2102   1.6874     0.9646        0        -          -        0        -          -
N018002        0        -          -        0        -          -     2440   2.0740     0.8368
N019002        0        -          -        0        -          -     2325   2.4733     0.9226
N020002        0        -          -        0        -          -     2156   1.9478     0.8013
N021002        0        -          -        0        -          -     2433   1.9333     0.8653
N000108     2004   2.6170     1.3651     2266   2.9622     1.2648     2379   3.4443     1.3081
N007208     2138   2.5355     1.4528     2294   2.7952     1.5562     2469   3.2046     1.5125
N008008        0        -          -     2328   2.?819     1.2713     2376   3.3319     1.3617
N014909     2103   2.5873     1.3299        0        -          -        0        -          -




Chapter 11.2


THE EFFECT OF MODE OF ITEM ADMINISTRATION (BIB SPIRAL OR PACED TAPE)
ON ESTIMATES OF WRITING PERFORMANCE


Eugene  G.  Johnson 
Educational  Testing  Service 


The Year 15 NAEP writing assessment, the fourth such assessment in the
history of NAEP, is the first writing assessment in which the Balanced
Incomplete Block spiral design was used for assigning exercises to
students. In the three earlier writing assessments, Year 1 (1969-70),
Year 5 (1973-74) and Year 10 (1978-79), the total battery of writing items
was divided into a number of mutually exclusive booklets, called packages,
and each such package was, in turn, administered to a nationally
representative sample of students. While this matrix design allows
analysis of the interrelationships between exercises appearing in the same
package, the interrelationships between exercises in different packages
cannot be readily estimated, because no student was administered more than
one of the packages.

The  Year  15  NAEP  design  has  remedied  this  deficiency  through  a  complex 
variant of matrix sampling called balanced incomplete block (BIB)
spiralling.    Details  of  this  procedure  appear  in  Chapter  5.    In  brief,  the 
total  assessment  battery  (of  both  reading  and  writing  exercises)  was 
divided  into  item  blocks  requiring  an  assessment  time  of  fourteen  minutes. 
Each  of  these  blocks  was  then  assigned  to  57  assessment  booklets  in  such  a 
manner  that  each  booklet  consisted  of  three  blocks  and  each  block  of 
exercises  was  paired  with  every  other  block  in  at  least  one  of  the 
booklets.    Since  some  writing  items  required  a  response  time  longer  than 
the  fourteen  minutes  permitted  in  the  BIB  design,  six  special  booklets,  the 
unbalanced  incomplete  block  (UBIB)  booklets,  were  created  to  accommodate 
these  items.    Each  UBIB  booklet  consisted  of  one  "double  block",  containing 
a  longer  item  and  requiring  30  minutes  of  testing  time,  and  one  of  the 
regular  BIB  blocks. 
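
The pairing property described above (every block appears with every other block in some booklet) can be checked mechanically. The sketch below uses a deliberately small toy design, seven blocks in seven three-block booklets built by cycling a base booklet, and is not the actual NAEP block-to-booklet assignment; the function names and the base booklet (0, 1, 3) are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

def cyclic_booklets(n_blocks, base):
    """Build booklets by shifting a base booklet through all block numbers."""
    return [tuple(sorted((b + shift) % n_blocks for b in base))
            for shift in range(n_blocks)]

def pair_counts(booklets):
    """Count how many booklets contain each unordered pair of blocks."""
    counts = Counter()
    for booklet in booklets:
        for pair in combinations(sorted(booklet), 2):
            counts[pair] += 1
    return counts

# Toy design: 7 blocks, 7 booklets of 3 blocks each (not the NAEP design).
booklets = cyclic_booklets(7, (0, 1, 3))
counts = pair_counts(booklets)
all_pairs = list(combinations(range(7), 2))
every_pair_covered = all(counts[p] >= 1 for p in all_pairs)
```

In this toy design each of the 21 possible pairs of blocks occurs in exactly one booklet, which is the balance that makes between-block relationships estimable from the booklet data.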

The total set of BIB and UBIB booklets was then spiralled, cycling the
booklets for administration so that, typically, no two students in any
assessment session in a school received the same booklet. More
importantly, every item block and every pair of item blocks (within the BIB
portion of the assessment) was administered to a representative sample of
students, enabling the examination of interrelationships between all items
encompassed by the BIB blocks. (For UBIB booklets, interrelationships can
only be directly estimated for certain of the items.)


The statistical programming for this section was provided by Bruce
Kaplan. The figures were produced by Ira Sample.

The  change  to  the  BIB  spiralling  design  results  in  improved  sampling 
efficiency  and  analysis  potential,  but  at  a  cost.    Prior  to  the  Year  15 
assessment,  assessments  of  writing  (and  all  other  areas)  were  accompanied 
by  paced  audiotapes  of  the  exercise  stimuli.    The  advantage  of  such  a  mode 
of  administration  is  that  it  allows  for  the  separation  of  reading  ability 
from  the  subject  area  being  assessed.    In  paced  administrations  of  the 
writing  assessment,  the  instructions  for  the  exercise  are  read  aloud  so 
students  can  respond  to  the  exercise  even  though  they  may  have  difficulty 
reading  the  instructions.    This  type  of  administration  was  possible  because 
all  students  in  a  particular  paced  assessment  session  received  the  same 
package.  Because  each  student  in  a  BIB  spiralled  assessment  session  has 
typically  received  a  different  booklet,  it  is  not  possible  to  accompany  a 
BIB  spiralled  assessment  session  with  paced  audiotapes. 

To determine the effect of this change in mode of administration (from
paced  to  BIB  spiralled)  on  estimates  of  writing  achievement,  a  selected 
subset  of  writing  exercises  was  administered  both  as  part  of  the  primary 
BIB  spiralled  assessment  and  as  part  of  a  much  smaller  paced  tape 
assessment.    The  Year  15  paced  tape  assessment  was  also  designed  to 
ascertain  the  effect  of  change  in  mode  of  administration  on  estimates  of 
reading  achievement  (the  results  of  this  are  reported  in  Section  10.3.6). 
This  portion  of  the  Year  15  assessment  of  reading  and  writing  was  based  on 
an  additional  administration  of  approximately  one  third  of  the  reading  and 
writing  exercises  by  the  previously  used  paced  tape  procedures. 

The  exercises  to  be  administered  by  paced  tape  procedures  at  a  given 
age  were  divided  into  four  distinct  packages.  Each  package  was  then 
administered  to  a  probability  sample  of  students  representative  of  the 
nation.    Between  1,300  and  1,600  students  responded  to  each  of  these 
packages.    Each  of  the  paced  tape  packages  was  administered  in  exactly  the 
same  manner  as  the  paced  administrations  in  past  assessments. 

Because  writing  exercises  generally  require  more  response  time  than 
reading exercises, fewer writing exercises could be chosen as a part of the
BIB-pace comparison. Three writing exercises were chosen for this purpose
at  each  assessment  age  level.    The  criteria  for  selection  were: 

(1)  The  exercises  had  to  have  been  administered  in  the  previous 
(Year  10)  writing  assessment  and,  if  possible,  also  in  the 
Year  5  assessment. 

(2)  The  exercises  were  to  be  representative  of  each  of  the  three 
major  purposes  of  writing  as  measured  by  the  informative, 
persuasive  and  imaginative  tasks. 

(3)  Subject  to  1  and  2,  each  of  the  exercises  was  to  be  given  to 
more  than  one  age. 




The result of this selection is the set of four writing exercises shown in
Table 11.2(1). Of these four exercises, two were assigned to all three ages
("Hole in the Box" and "Dali"). One of the remaining two, "Split Session",
had  been  given  at  ages  13  and  17  only  and  was  replaced  by  "Aunt  May"  for 
the  age  9  comparison.    One  exercise,  "Hole  in  the  Box",  an  imaginative 
task,  was  presented  in  both  the  Year  5  and  Year  10  assessments.  The 
remaining  three  exercises,  the  informative  task  "Dali"  and  the  persuasive 
tasks  "Split  Session"  and  "Aunt  May",  were  previously  presented  only  in  the 
Year  10  assessment. 

The  assignment  of  the  exercises  to  the  paced  tape  packages  is  also 
shown  in  Table  11.2(1).    Of  the  four  packages  administered  at  a  given  age, 
two included writing exercises. One of these packages, P2, included two
writing exercises: "Dali" and either "Aunt May" (at Age 9) or "Split
Session" (at Ages 13 and 17). Consequently, the estimates for these
exercises  are  based  on  the  same  sample  of  students  in  a  given  age.  The 
remaining  package,  P4,  included  a  single  exercise,  "Hole  in  the  Box"; 
estimates  for  this  exercise  for  an  age  are  based  on  a  different,  but 
randomly  equivalent,  subsample  of  students. 

The  responses  to  the  exercises  from  both  modes  of  administration  were 
professionally  scored  for  task  accomplishment  (primary  trait  scoring).  (A 
discussion  of  professional  scoring  is  provided  in  Chapter  8.2.)    The  five 
levels  of  proficiency  used  to  categorize  the  responses,  along  with  their 
numeric  codes,  are: 

0:  unrateable 

1:  unsatisfactory 

2:  minimal 

3:  adequate 

4:  elaborated 

Assessment  results  are  reported  both  in  terms  of  the  proportion  of 
students  whose  writing  reaches  or  exceeds  a  given  level  of  proficiency  and 
in  terms  of  mean  proficiency  levels. 
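
The two reporting statistics named above can be sketched as weighted summaries of the numeric codes. The sketch assumes that every response, including those coded 0 (unrateable), enters the calculation and that each student carries a sampling weight; neither assumption is stated in this section, and the function names and sample values are hypothetical.

```python
def weighted_mean(scores, weights):
    """Weighted mean proficiency level over the numeric codes 0 to 4."""
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total_weight

def weighted_percent_at_or_above(scores, weights, level=2):
    """Weighted percent of students whose code reaches or exceeds `level`."""
    total_weight = sum(weights)
    reached = sum(w for s, w in zip(scores, weights) if s >= level)
    return 100.0 * reached / total_weight

# Hypothetical codes and sampling weights, not NAEP data.
scores = [0, 1, 2, 3, 4]
weights = [1.0, 1.0, 2.0, 0.5, 0.5]
mean_level = weighted_mean(scores, weights)
pct_minimal = weighted_percent_at_or_above(scores, weights, level=2)
```

Setting `level=2` corresponds to the "minimal" category, the cut point used in the comparison tables that follow.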

Tables  11.2(2)  through  11.2(10)  show  the  comparison  of  the  estimates  of 
writing  proficiency  for  the  BIB  and  paced  modes  of  administration  by  age 
and  for  a  selected  set  of  demographic  subgroups  within  each  age.  Each 
table  shows  both  the  estimated  percent  of  students  of  a  given  type  scoring 
at  or  above  the  minimal  (2)  proficiency  level  and  the  estimated  mean 
proficiency  level  for  the  subgroup.    The  numbers  in  parentheses  are  the 
estimates  of  the  sampling  standard  errors  of  these  proficiency  estimates. 
Also  included  are  the  differences  in  proficiency  level  between  the  BIB  and 
paced  modes  of  administration  (DIFFER),  accompanied  by  a  standard  error. 
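
The DIFFER rows can be reproduced, at least approximately, by treating the BIB and paced samples as independent. This section does not state the formula, so the root-sum-of-squares standard error below is an assumption; it does agree with the printed Age 9 "Aunt May" total row of Table 11.2(2), whose values are used here. The report's significance criterion is not reproduced.

```python
from math import sqrt

def differ(bib, bib_se, paced, paced_se):
    """Difference (BIB minus paced) and its standard error, assuming
    the two estimates come from independent samples."""
    diff = bib - paced
    se = sqrt(bib_se ** 2 + paced_se ** 2)
    return diff, se

# Age 9 "Aunt May", total sample, from Table 11.2(2).
mean_diff, mean_diff_se = differ(1.61, 0.03, 1.90, 0.04)
pct_diff, pct_diff_se = differ(44.99, 1.30, 58.24, 2.02)
```

Small discrepancies in the last digit against the printed tables (for example -13.25 here against -13.24 as printed) are to be expected, since the published estimates were presumably differenced before rounding.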

Figures 11.2-1, 11.2-2 and 11.2-3 are plots by subgroup and age of the
differences between the percent at or above minimum proficiency. In
general, the previous writing assessment procedures using paced audiotapes
are significantly less difficult for a student than the BIB spiralled
procedure, which relies on a student's ability to read and understand the
instructions given by the writing prompt. In every case where there is a
significant difference, responses were rated better for the paced mode of
administration.


Furthermore,  the  effect  of  mode  of  administration  is  differential  in 
that  differences  in  performance  levels  are  greater  for  some  subgroups  than 
for  others.    The  effect  of  mode  of  administration  also  varies  from  item  to 
item  within  subgroup. 

Because  of  the  differential  effect  of  mode  of  administration  across 
items  and  subgroups,  it  was  felt  that  the  responses  to  the  BIB  and  paced 
modes  of  administration  could  not  be  reliably  equated.    This  has  important 
consequences  in  the  measurement  of  trends  in  writing  achievement  across 
time.    These  consequences  are  discussed  in  the  following  chapter. 




Table  11.2(1) 

Writing  Exercises  Selected  for  the  BIB/Pace  Comparison 

Exercise                 Task               Ages  Assessment  Years  Package 

Hole  in  the  Box       Imaginative       9,  13,  17  Years  5,  10,  15  P4 

Dali                        Informative       9,  13,  17  Years  10,  15  P2 

Aunt  May                 Persuasive              9  Years  10,  15  P2 

Split  Session          Persuasive           13,  17  Years  10,  15  P2 




Table 11.2(2)

Effect of Mode of Administration on Writing Performance
Age 9 Primary Trait Score - "Aunt May"

                                  N    MEAN               % >= 2

-- TOTAL --           BIB      1960    1.61( 0.03)      44.99( 1.30)
                      PACED    1356    1.90( 0.04)      58.24( 2.02)
                      DIFFER          -0.29( 0.05)*    -13.24( 2.40)*

ETHNICITY/RACE
  White               BIB      1316    1.71( 0.03)      49.75( 1.36)
                      PACED     869    2.01( 0.05)      63.64( 2.34)
                      DIFFER          -0.30( 0.06)*    -13.89( 2.70)*
  Black               BIB       276    1.24( 0.06)      28.30( 2.92)
                      PACED     223    1.57( 0.06)      43.69( 3.69)
                      DIFFER          -0.33( 0.09)*    -15.39( 4.70)*
  Hispanic            BIB       288    1.34( 0.05)      33.24( 3.20)
                      PACED     203    1.55( 0.10)      40.93( 5.36)
                      DIFFER          -0.21( 0.12)      -7.69( 6.24)

PARENTAL EDUCATION
  Not graduated H.S.  BIB       125    1.41( 0.07)      40.46( 5.42)
                      PACED      76    1.68( 0.10)      45.72( 5.52)
                      DIFFER          -0.27( 0.12)*     -5.26( 7.73)
  Graduated H.S.      BIB       378    1.62( 0.05)      43.36( 2.92)
                      PACED     280    1.88( 0.06)      54.96( 3.02)
                      DIFFER          -0.26( 0.08)*    -11.60( 4.20)*
  Post H.S.           BIB       715    1.76( 0.05)      52.83( 2.39)
                      PACED     472    2.01( 0.05)      63.89( 2.34)
                      DIFFER          -0.24( 0.07)*    -11.06( 3.34)*
  Unknown             BIB       713    1.48( 0.04)      39.19( 2.09)
                      PACED     514    1.85( 0.06)      57.21( 3.05)
                      DIFFER          -0.37( 0.07)*    -18.01( 3.70)*

SIZE/TYPE OF COMMUNITY
  Disadvantaged Urban BIB       246    1.27( 0.05)      31.56( 2.37)
                      !PACED    194    1.67( 0.07)      46.94( 2.90)
                      DIFFER          -0.39( 0.09)*    -15.39( 3.74)*
  Advantaged Urban    BIB       247    1.85( 0.09)      55.51( 4.47)
                      !PACED    183    2.27( 0.07)      74.79( 2.33)
                      DIFFER          -0.42( 0.12)*    -19.27( 5.04)*

GRADE
  < Modal Grade       BIB       576    1.26( 0.05)      29.00( 2.16)
                      PACED     458    1.58( 0.07)      44.42( 3.92)
                      DIFFER          -0.33( 0.08)*    -15.36( 4.48)*
  At Modal Grade      BIB      1378    1.71( 0.03)      49.84( 1.40)
                      PACED     893    2.05( 0.05)      64.98( 2.17)
                      DIFFER          -0.34( 0.06)*    -15.14( 2.58)*
  > Modal Grade       !BIB        6    2.09( 0.77)      43.94(28.29)
                      !PACED      5    2.22( 0.37)      91.29(10.05)
                      DIFFER          -0.13( 0.85)     -47.35(30.02)

*    Significant difference between BIB and Pace (Alpha = .05)

!    Interpret with caution--standard errors are poorly estimated.




Table 11.2(3)

Effect of Mode of Administration on Writing Performance
Age 9 Primary Trait Score - "Dali"

                                  N    MEAN               % >= 2

-- TOTAL --           BIB      1680    1.32( 0.02)      39.12( 1.38)
                      PACED    1356    1.57( 0.03)      55.46( 1.89)
                      DIFFER          -0.24( 0.03)*    -16.33( 2.34)*

ETHNICITY/RACE
  White               BIB      1132    1.39( 0.02)      43.52( 1.70)
                      PACED     869    1.62( 0.03)      59.58( 2.45)
                      DIFFER          -0.23( 0.04)*    -16.06( 2.98)*
  Black               BIB       243    1.08( 0.05)      24.79( 3.38)
                      PACED     223    1.35( 0.05)      38.64( 4.41)
                      DIFFER          -0.27( 0.07)*    -13.86( 5.55)*
  Hispanic            BIB       227    1.21( 0.06)      31.16( 4.55)
                      PACED     203    1.44( 0.05)      46.57( 3.87)
                      DIFFER          -0.24( 0.08)*    -15.42( 5.97)*

PARENTAL EDUCATION
  Not graduated H.S.  BIB        95    1.14( 0.07)      27.68( 4.78)
                      PACED      76    1.51( 0.07)      51.65( 6.05)
                      DIFFER          -0.37( 0.10)*    -23.98( 7.71)*
  Graduated H.S.      BIB       332    1.19( 0.04)      30.89( 2.87)
                      PACED     280    1.53( 0.04)      54.37( 3.94)
                      DIFFER          -0.34( 0.06)*    -23.47( 4.87)*
  Post H.S.           BIB       622    1.45( 0.03)      47.85( 2.37)
                      PACED     472    1.65( 0.05)      61.04( 2.91)
                      DIFFER          -0.19( 0.06)*    -13.20( 3.76)*
  Unknown             BIB       615    1.29( 0.03)      36.78( 2.06)
                      PACED     514    1.53( 0.03)      52.53( 2.55)
                      DIFFER          -0.24( 0.04)*    -15.75( 3.28)*

SIZE/TYPE OF COMMUNITY
  Disadvantaged Urban BIB       202    1.17( 0.06)      26.20( 4.35)
                      !PACED    194    1.40( 0.06)      41.56( 5.38)
                      DIFFER          -0.23( 0.09)*    -15.37( 6.92)
  Advantaged Urban    BIB       218    1.48( 0.07)      47.95( 3.27)
                      !PACED    183    1.73( 0.04)      68.21( 3.33)
                      DIFFER          -0.24( 0.08)*    -20.26( 4.67)*

GRADE
  < Modal Grade       BIB       459    1.06( 0.03)      20.56( 2.27)
                      PACED     458    1.36( 0.04)      38.21( 3.11)
                      DIFFER          -0.30( 0.05)*    -17.64( 3.85)*
  At Modal Grade      BIB      1214    1.39( 0.03)      43.84( 1.63)
                      PACED     893    1.67( 0.03)      64.01( 2.03)
                      DIFFER          -0.28( 0.04)*    -20.17( 2.60)*
  > Modal Grade       !BIB        7    1.96( 0.16)      84.60(11.23)
                      !PACED      5    1.73( 0.27)      72.56(27.23)
                      DIFFER           0.24( 0.32)      12.03(29.45)

*    Significant difference between BIB and Pace (Alpha = .05)

!    Interpret with caution--standard errors are poorly estimated.




Table 11.2(4)

Effect of Mode of Administration on Writing Performance
Age 9 Primary Trait Score - "Hole in the Box"

                                  N    MEAN               % >= 2

-- TOTAL --           BIB      2029    1.30( 0.02)      37.21( 1.59)
                      PACED    1344    1.55( 0.03)      54.56( 1.97)
                      DIFFER          -0.25( 0.04)*    -17.34( 2.53)*

ETHNICITY/RACE
  White               BIB      1345    1.34( 0.03)      38.47( 1.98)
                      PACED     832    1.58( 0.03)      56.56( 2.21)
                      DIFFER          -0.24( 0.04)*    -18.09( 2.96)*
  Black               BIB       308    1.19( 0.05)      34.28( 2.69)
                      PACED     178    1.45( 0.09)      47.85( 7.91)
                      DIFFER          -0.25( 0.10)*    -13.57( 8.36)
  Hispanic            BIB       273    1.22( 0.07)      32.38( 3.77)
                      PACED     263    1.46( 0.07)      47.95( 4.38)
                      DIFFER          -0.23( 0.10)*    -15.57( 5.78)*

PARENTAL EDUCATION
  Not graduated H.S.  BIB       121    1.21( 0.09)      26.47( 5.16)
                      PACED     104    1.50( 0.05)      48.03( 4.30)
                      DIFFER          -0.29( 0.10)*    -21.56( 6.71)*
  Graduated H.S.      BIB       392    1.29( 0.04)      33.55( 2.25)
                      PACED     277    1.47( 0.05)      47.33( 3.55)
                      DIFFER          -0.18( 0.06)*    -13.78( 4.20)*
  Post H.S.           BIB       724    1.40( 0.04)      44.66( 2.55)
                      PACED     453    1.65( 0.04)      62.24( 2.78)
                      DIFFER          -0.26( 0.0?)*    -17.58( 3.77)*
  Unknown             BIB       773    1.25( 0.03)      34.35( 2.51)
                      PACED     495    1.52( 0.04)      53.14( 2.61)
                      DIFFER          -0.28( 0.05)*    -18.80( 3.62)*

SIZE/TYPE OF COMMUNITY
  Disadvantaged Urban BIB       276    1.18( 0.06)      35.58( 4.03)
                      !PACED    205    1.58( 0.08)      60.06( 4.15)
                      DIFFER          -0.40( 0.10)*    -24.47( 5.78)*
  Advantaged Urban    BIB       232    1.46( 0.05)      48.13( 4.62)
                      !PACED     90    1.77( 0.05)      71.26( 3.03)
                      DIFFER          -0.31( 0.08)*    -23.14( 5.53)*

GRADE
  < Modal Grade       BIB       635    1.06( 0.04)      21.93( 2.76)
                      PACED     433    1.40( 0.04)      45.27( 2.78)
                      DIFFER          -0.34( 0.05)*    -23.34( 3.91)*
  At Modal Grade      BIB      1386    1.38( 0.02)      42.08( 1.72)
                      PACED     907    1.62( 0.03)      58.83( 2.33)
                      DIFFER          -0.24( 0.04)*    -16.75( 2.89)*
  > Modal Grade       !BIB        8    1.77( 0.26)      76.74(26.37)
                      !PACED      4    2.00( 0.00)     100.00(  0.0)
                      DIFFER          -0.23( 0.26)     -23.26(26.37)

*    Significant difference between BIB and Pace (Alpha = .05)

!    Interpret with caution--standard errors are poorly estimated.




Table 11.2(5)

Effect of Mode of Administration on Writing Performance
Age 13 Primary Trait Score - "Split Session"

                                  N    MEAN               % >= 2

-- TOTAL --           BIB      2241    1.37( 0.02)      31.77( 1.10)
                      PACED    1276    1.43( 0.02)      34.08( 1.61)
                      DIFFER          -0.06( 0.03)      -2.31( 1.95)

ETHNICITY/RACE
  White               BIB      1631    1.42( 0.02)      34.59( 1.13)
                      PACED     889    1.48( 0.03)      36.56( 1.62)
                      DIFFER          -0.06( 0.03)      -1.98( 1.98)
  Black               BIB       293    1.26( 0.06)      24.03( 3.17)
                      PACED     211    1.30( 0.06)      28.12( 4.37)
                      DIFFER          -0.04( 0.08)      -4.09( 5.40)
  Hispanic            BIB       264    1.18( 0.04)      21.07( 2.61)
                      PACED     126    1.21( 0.06)      20.41( 6.70)
                      DIFFER          -0.03( 0.07)       0.67( 7.19)

PARENTAL EDUCATION
  Not graduated H.S.  BIB       208    1.18( 0.05)      22.52( 3.58)
                      PACED      92    1.23( 0.05)      23.45( 5.02)
                      DIFFER          -0.05( 0.07)      -0.93( 6.17)
  Graduated H.S.      BIB       784    1.32( 0.03)      28.16( 1.87)
                      PACED     451    1.39( 0.03)      31.67( 2.39)
                      DIFFER          -0.07( 0.04)      -3.51( 3.04)
  Post H.S.           BIB      1003    1.49( 0.03)      38.86( 1.78)
                      PACED     574    1.54( 0.04)      40.53( 2.10)
                      DIFFER          -0.04( 0.04)      -1.67( 2.75)
  Unknown             BIB       226    1.19( 0.05)      22.16( 3.23)
                      PACED     130    1.17( 0.06)      18.08( 3.79)
                      DIFFER           0.02( 0.07)       4.08( 4.98)

SIZE/TYPE OF COMMUNITY
  Disadvantaged Urban BIB       229    1.21( 0.06)      22.95( 4.11)
                      !PACED    141    1.20( 0.12)      20.06( 8.38)
                      DIFFER           0.02( 0.14)       2.89( 9.34)
  Advantaged Urban    !BIB      264    1.47( 0.03)      37.34( 2.43)
                      !PACED     81    1.55( 0.06)      42.02( 4.86)
                      DIFFER          -0.09( 0.07)      -4.67( 5.43)

GRADE
  < Modal Grade       BIB       655    1.18( 0.03)      20.76( 2.04)
                      PACED     393    1.27( 0.03)      24.52( 2.11)
                      DIFFER          -0.10( 0.04)*     -3.76( 2.94)
  At Modal Grade      BIB      1579    1.46( 0.02)      36.62( 1.44)
                      PACED     882    1.50( 0.03)      38.37( 2.01)
                      DIFFER          -0.04( 0.04)      -1.75( 2.47)
  > Modal Grade       !BIB        7    1.21( 0.13)      20.93(13.22)
                      !PACED      1    1.00(  0.0)       0.0 (  0.0)
                      DIFFER           0.21( 0.13)      20.93(13.22)

*    Significant difference between BIB and Pace (Alpha = .05)

!    Interpret with caution--standard errors are poorly estimated.




Table 11.2(6)

Effect of Mode of Administration on Writing Performance
Age 13 Primary Trait Score - "Dali"

                                  N    MEAN               % >= 2

-- TOTAL --           BIB      1890    1.90( 0.02)      71.67( 1.21)
                      PACED    1276    2.01( 0.02)      81.40( 1.20)
                      DIFFER          -0.12( 0.03)*     -9.73( 1.70)*

ETHNICITY/RACE
  White               BIB      1425    1.96( 0.03)      75.36( 1.55)
                      PACED     889    2.09( 0.03)      84.94( 1.37)
                      DIFFER          -0.14( 0.04)*     -9.57( 2.07)*
  Black               BIB       216    1.61( 0.05)      54.80( 3.45)
                      PACED     211    1.72( 0.03)      67.94( 3.42)
                      DIFFER          -0.12( 0.06)*    -13.14( 4.86)*
  Hispanic            BIB       195    1.68( 0.06)      59.08( 3.11)
                      PACED     126    1.76( 0.06)      72.33( 4.75)
                      DIFFER          -0.08( 0.09)     -13.24( 5.68)*

PARENTAL EDUCATION
  Not graduated H.S.  BIB       149    1.68( 0.08)      62.60( 5.73)
                      PACED      92    1.75( 0.06)      74.98( 5.51)
                      DIFFER          -0.08( 0.10)     -12.38( 7.95)
  Graduated H.S.      BIB       659    1.81( 0.03)      68.49( 2.22)
                      PACED     451    1.95( 0.03)      80.29( 1.83)
                      DIFFER          -0.13( 0.04)*    -11.80( 2.88)*
  Post H.S.           BIB       890    2.0?( 0.03)      77.70( 1.43)
                      PACED     574    2.12( 0.03)      84.45( 1.75)
                      DIFFER          -0.09( 0.04)*     -6.74( 2.26)*
  Unknown             BIB       177    1.65( 0.04)      58.48( 4.42)
                      PACED     130    1.80( 0.06)      74.39( 4.62)
                      DIFFER          -0.15( 0.07)*    -15.90( 6.39)*

SIZE/TYPE OF COMMUNITY
  Disadvantaged Urban BIB       167    1.67( 0.06)      58.83( 4.19)
                      !PACED    141    1.66( 0.09)      63.56( 7.64)
                      DIFFER           0.01( 0.11)      -4.73( 8.72)
  Advantaged Urban    !BIB      225    2.27( 0.08)      87.10( 2.36)
                      !PACED     81    2.25( 0.06)      87.74( 2.28)
                      DIFFER           0.01( 0.11)      -0.64( 3.28)

GRADE
  < Modal Grade       BIB       545    1.67( 0.03)      59.96( 2.20)
                      PACED     393    1.85( 0.04)      78.35( 2.43)
                      DIFFER          -0.17( 0.05)*    -18.39( 3.28)*
  At Modal Grade      BIB      1336    1.99( 0.02)      76.42( 1.38)
                      PACED     882    2.09( 0.03)      82.74( 1.55)
                      DIFFER          -0.10( 0.04)*     -6.32( 2.07)*
  > Modal Grade       !BIB        9    2.47( 0.33)      95.48( 4.89)
                      !PACED      1    3.00( 0.00)     100.00(  0.0)
                      DIFFER          -0.53( 0.33)      -4.52( 4.89)

*    Significant difference between BIB and Pace (Alpha = .05)

!    Interpret with caution--standard errors are poorly estimated.




Table 11.2(7)

Effect of Mode of Administration on Writing Performance
Age 13 Primary Trait Score - "Hole in the Box"

                                  N    MEAN               % >= 2

-- TOTAL --           BIB      2290    1.74( 0.02)      60.31( 1.43)
                      PACED    1289    1.84( 0.04)      66.68( 2.15)
                      DIFFER          -0.09( 0.04)*     -6.37( 2.58)*

ETHNICITY/RACE
  White               BIB      1640    1.81( 0.02)      6?.33( 1.55)
                      PACED     915    1.83( 0.04)      6?.?3( 2.35)
                      DIFFER          -0.02( 0.05)      -2.39( 2.82)
  Black               BIB       320    1.48( 0.07)      48.65( 3.89)
                      PACED     160    1.92( 0.06)      74.23( 3.26)
                      DIFFER          -0.43( 0.09)*    -25.57( 5.08)*
  Hispanic            BIB       248    1.55( 0.09)      52.69( 4.65)
                      PACED     178    1.73( 0.06)      62.74( 6.17)
                      DIFFER          -0.18( 0.10)     -10.04( 7.73)

PARENTAL EDUCATION
  Not graduated H.S.  BIB       183    1.64( 0.05)      55.62( 3.61)
                      PACED     114    1.85( 0.10)      62.83( 6.67)
                      DIFFER          -0.21( 0.11)      -7.21( 7.58)
  Graduated H.S.      BIB       831    1.72( 0.03)      60.49( 1.65)
                      PACED     470    1.74( 0.05)      63.06( 3.13)
                      DIFFER          -0.02( 0.06)      -2.57( 3.53)
  Post H.S.           BIB      1012    1.85( 0.03)      63.80( 1.73)
                      PACED     567    1.95( 0.05)      72.19( 2.31)
                      DIFFER          -0.10( 0.05)      -8.39( 2.88)*
  Unknown             BIB       233    1.42( 0.07)      45.59( 4.52)
                      PACED     130    1.71( 0.07)      59.28( 4.21)
                      DIFFER          -0.29( 0.10)*    -13.69( 6.18)

SIZE/TYPE OF COMMUNITY
  Disadvantaged Urban BIB       229    1.51( 0.07)      44.81( 4.24)
                      !PACED    113    1.91( 0.05)      67.26( 4.24)
                      DIFFER          -0.40( 0.09)*    -22.45( 5.99)*
  Advantaged Urban    !BIB      254    2.16( 0.08)      81.72( 3.23)
                      !PACED    123    2.04( 0.18)      69.19( 8.71)
                      DIFFER           0.11( 0.20)      12.53( 9.29)

GRADE
  < Modal Grade       BIB       736    1.50( 0.04)      49.06( 2.67)
                      PACED     431    1.72( 0.06)      58.98( 4.09)
                      DIFFER          -0.21( 0.07)*     -9.92( 4.88)
  At Modal Grade      BIB      1548    1.8?( 0.02)      65.89( 1.65)
                      PACED     854    1.89( 0.04)      70.53( 2.18)
                      DIFFER          -0.04( 0.05)      -4.64( 2.73)
  > Modal Grade       !BIB        6    1.93( 0.68)      47.29(23.09)
                      !PACED      4    2.71( 0.19)     100.00(  0.0)
                      DIFFER          -0.78( 0.71)     -52.71(23.09)*

*    Significant difference between BIB and Pace (Alpha = .05)

!    Interpret with caution--standard errors are poorly estimated.




Table  11.2(8) 

Effect  of  Mode  of  Administration  on  Writing  Performance 
Age  17  Primary  Trait  Score  -  "Split  Session" 


N 


MEAN 


%  >=  2 


—  TOTAL  — 

ETHNICITY/RACE 
White 

Black 

Hispanic 

PARENTAL  EDUCATION 
Not  graduated  H.S. 

Graduated  H.S. 

Post  H.S. 

Unknown 


BIB  2382 
PACED  1540 
DIFFER 


BIB  1705 
PACED  1079 
DIFFER 

BIB  370 
PACED  242 
DIFFER 

BIB  236 
PACED  163 
DIFFER 


BIB  313 

PACED  194 
DIFFER 

BIB  833 

PACED  558 
DIFFER 

BIB  1146 

PACED  696 
DIFFER 


!BIB 
! PACED 
DIFFER 


74 

54 


1.71(  0.01) 
1.82(  0.04) 
-0.11(  0.04)* 


1.77(  0.02) 
1.87(  0.05) 
-0.10(  0.05) 

1.48(  0.04) 
1.68(  0.05) 
-0.20(  0.07)* 

1.59(  0.07) 
1.68(  0.08) 
-0.09(  0.10) 


1.53(  0.05) 
1.74(  0.05) 
-0.22(  0.07)* 

1.71(  0.02) 
1.76(  0.03) 
-0.05(  0.04) 

1.78(  0.02) 
1.9H  0.08) 
-0.13(  0.08) 

1.28(  0.08) 
1.29(  0.08) 
-0.01(  0.11) 


59.70(  0.83) 
63.79(  2.54) 
-4.09(  2.67) 


62.61(  1.36) 

66.95(  2.92) 

-4.34(  3.22) 

48.65(  2.92) 

54.73(  3.15) 

-6.09(  4.29) 

53.49(  5.07) 

57.19(  5.99) 

-3.70(  7.85) 


48.63(  3.67) 
63.49(  3.42) 
-14.86(  5.01)* 

61.08(  1.61) 
61.38(  2.13) 
-0.30(  2.67) 

63.04(  1.12) 
67.21(  4.44) 
-4.17(  4.58) 

34.71(  5.08) 
28.90(  6.39) 
5.81(  8.17) 


*    Significant  difference  between  BIB  and  Pace  (Alpha  =  .05) 
!  Interpret  with  caution — standard  errors  are  poorly  estimated 




Table 11.2(8)
(continued)

Effect of Mode of Administration on Writing Performance
Age 17 Primary Trait Score - "Split Session"

                                    N    MEAN            % >= 2

SIZE/TYPE OF COMMUNITY
Disadvantaged Urban     !BIB      254    1.51( 0.04)      52.99( 3.11)
                        !PACED    181    1.67( 0.06)      55.73( 3.15)
                         DIFFER         -0.16( 0.07)*     -2.74( 4.42)

Advantaged Urban         BIB      289    1.79( 0.05)      61.54( 3.56)
                         PACED    190    1.85( 0.21)      62.31(13.19)
                         DIFFER         -0.07( 0.22)      -0.77(13.66)

GRADE
< Modal Grade            BIB      439    1.53( 0.03)      50.39( 2.08)
                         PACED    310    1.55( 0.05)      49.55( 4.06)
                         DIFFER         -0.02( 0.06)       0.84( 4.56)

At Modal Grade           BIB     1748    1.74( 0.02)      61.54( 0.99)
                         PACED   1097    1.89( 0.05)      67.38( 2.75)
                         DIFFER         -0.15( 0.05)*     -5.84( 2.92)

> Modal Grade            BIB      195    1.89( 0.08)      68.85( 3.94)
                         PACED    133    1.85( 0.09)      65.09( 5.63)
                         DIFFER          0.04( 0.11)       3.77( 6.88)

*  Significant difference between BIB and Pace (Alpha = .05)
!  Interpret with caution: standard errors are poorly estimated.




Table 11.2(9)

Effect of Mode of Administration on Writing Performance
Age 17 Primary Trait Score - "Dali"

                                    N    MEAN            % >= 2

-- TOTAL --              BIB     2282    2.13( 0.02)      81.95( 1.05)
                         PACED   1540    2.28( 0.04)      88.99( 1.19)
                         DIFFER         -0.14( 0.05)*     -7.04( 1.58)*

ETHNICITY/RACE
White                    BIB     1706    2.21( 0.03)      84.98( 1.20)
                         PACED   1079    2.35( 0.05)      90.95( 1.30)
                         DIFFER         -0.14( 0.05)*     -5.98( 1.76)*

Black                    BIB      284    1.82( 0.05)      68.80( 2.62)
                         PACED    242    1.94( 0.05)      79.71( 2.99)
                         DIFFER         -0.12( 0.07)     -10.90( 3.98)*

Hispanic                 BIB      217    1.86( 0.06)      73.87( 3.50)
                         PACED    163    2.17( 0.07)      88.78( 2.09)
                         DIFFER         -0.31( 0.09)*    -14.91( 4.08)*

PARENTAL EDUCATION
Not graduated H.S.       BIB      275    1.89( 0.05)      75.32( 3.10)
                         PACED    194    2.16( 0.07)      87.36( 2.89)
                         DIFFER         -0.27( 0.08)*    -12.04( 4.24)*

Graduated H.S.           BIB      799    2.00( 0.03)      76.90( 1.46)
                         PACED    558    2.25( 0.05)      90.02( 1.83)
                         DIFFER         -0.25( 0.06)*    -13.12( 2.34)*

Post H.S.                BIB     1116    2.30( 0.04)      87.93( 1.38)
                         PACED    696    2.37( 0.07)      90.21( 1.68)
                         DIFFER         -0.07( 0.08)      -2.28( 2.17)

Unknown                  BIB       68    1.63( 0.11)      63.90( 5.19)
                         PACED     54    1.78( 0.13)      74.72( 6.84)
                         DIFFER         -0.15( 0.17)     -10.82( 8.59)

*  Significant difference between BIB and Pace (Alpha = .05)




Table 11.2(9)
(continued)

Effect of Mode of Administration on Writing Performance
Age 17 Primary Trait Score - "Dali"

                                    N    MEAN            % >= 2

SIZE/TYPE OF COMMUNITY
Disadvantaged Urban     !BIB      255    1.84( 0.04)      70.94( 3.03)
                        !PACED    181    1.89( 0.11)      74.00( 3.99)
                         DIFFER         -0.05( 0.12)      -3.06( 5.01)

Advantaged Urban         BIB      293    2.25( 0.10)      85.47( 3.74)
                         PACED    190    2.36( 0.15)      88.65( 2.21)
                         DIFFER         -0.11( 0.18)      -3.18( 4.35)

GRADE
< Modal Grade            BIB      391    1.84( 0.05)      69.67( 2.45)
                         PACED    310    2.04( 0.06)      82.35( 2.48)
                         DIFFER         -0.20( 0.08)*    -12.68( 3.48)*

At Modal Grade           BIB     1703    2.19( 0.03)      84.33( 1.13)
                         PACED   1097    2.34( 0.05)      91.38( 1.22)
                         DIFFER         -0.15( 0.05)*     -7.04( 1.66)*

> Modal Grade            BIB      188    2.34( 0.04)      91.56( 1.45)
                         PACED    133    2.24( 0.08)      83.90( 4.27)
                         DIFFER          0.10( 0.09)       7.66( 4.51)

*  Significant difference between BIB and Pace (Alpha = .05)
!  Interpret with caution: standard errors are poorly estimated.




Table 11.2(10)

Effect of Mode of Administration on Writing Performance
Age 17 Primary Trait Score - "Hole in the Box"

                                    N    MEAN            % >= 2

-- TOTAL --              BIB     2416    1.81( 0.03)      66.48( 1.41)
                         PACED   1534    1.98( 0.04)      75.13( 2.05)
                         DIFFER         -0.17( 0.05)*     -8.65( 2.49)*

ETHNICITY/RACE
White                    BIB     1750    1.89( 0.03)      69.87( 1.70)
                         PACED   1130    2.02( 0.04)      76.57( 2.27)
                         DIFFER         -0.13( 0.06)*     -6.70( 2.84)*

Black                    BIB      377    1.53( 0.04)      54.95( 2.16)
                         PACED    193    1.88( 0.08)      69.65( 4.30)
                         DIFFER         -0.35( 0.09)*    -14.70( 4.81)*

Hispanic                 BIB      224    1.59( 0.08)      57.06( 4.21)
                         PACED    172    1.88( 0.10)      71.15( 4.57)
                         DIFFER         -0.29( 0.12)*    -14.09( 6.21)*

PARENTAL EDUCATION
Not graduated H.S.       BIB      301    1.63( 0.04)      60.72( 2.71)
                         PACED    161    1.92( 0.08)      72.36( 4.28)
                         DIFFER         -0.29( 0.09)*    -11.64( 5.06)*

Graduated H.S.           BIB      853    1.74( 0.03)      63.22( 1.62)
                         PACED    543    1.92( 0.05)      72.43( 2.56)
                         DIFFER         -0.18( 0.06)*     -9.21( 3.03)*

Post H.S.                BIB     1148    1.95( 0.04)      71.86( 2.06)
                         PACED    775    2.08( 0.04)      79.44( 1.86)
                         DIFFER         -0.13( 0.06)*     -7.58( 2.78)*

Unknown                 !BIB       87    1.26( 0.08)      43.47( 5.79)
                        !PACED     52    1.33( 0.15)      47.97( 7.34)
                         DIFFER         -0.08( 0.17)      -4.50( 9.35)

*  Significant difference between BIB and Pace (Alpha = .05)
!  Interpret with caution: standard errors are poorly estimated.




Table 11.2(10)
(continued)

Effect of Mode of Administration on Writing Performance
Age 17 Primary Trait Score - "Hole in the Box"

                                    N    MEAN            % >= 2

SIZE/TYPE OF COMMUNITY
Disadvantaged Urban     !BIB      241    1.54( 0.05)      56.84( 3.49)
                        !PACED    179    1.82( 0.10)      72.02( 4.83)
                         DIFFER         -0.28( 0.12)*    -15.18( 5.95)*

Advantaged Urban         BIB      313    2.10( 0.09)      78.12( 3.21)
                        !PACED    221    2.05( 0.18)      74.30( 8.32)
                         DIFFER          0.05( 0.20)       3.82( 8.92)

GRADE
< Modal Grade            BIB      427    1.57( 0.05)      56.60( 3.28)
                         PACED    253    1.68( 0.08)      60.69( 4.47)
                         DIFFER         -0.11( 0.10)      -4.09( 5.55)

At Modal Grade           BIB     1780    1.86( 0.03)      68.87( 1.53)
                         PACED   1170    2.04( 0.04)      77.49( 2.07)
                         DIFFER         -0.18( 0.05)*     -8.62( 2.58)*

> Modal Grade            BIB      209    2.03( 0.08)      71.23( 3.50)
                         PACED    111    2.10( 0.05)      82.38( 3.70)
                         DIFFER         -0.07( 0.09)     -11.15( 5.10)

*  Significant difference between BIB and Pace (Alpha = .05)
!  Interpret with caution: standard errors are poorly estimated.



Figure  11.2-1 
AGE  9 

DIFFERENCE  BETWEEN  BIB  AND  PACE  PERCENTAGES 

RECEIVING A SCORE GREATER THAN OR EQUAL TO 2




Figure 11.2-2

AGE 13

DIFFERENCE BETWEEN BIB AND PACE PERCENTAGES
RECEIVING A SCORE GREATER THAN OR EQUAL TO 2

[Plot not reproduced. Vertical axis: difference in percentages, from -60 to 40. Legend: ▲ SPLIT, + DALI, O HOLE.]




Figure 11.2-3

AGE 17

DIFFERENCE BETWEEN BIB AND PACE PERCENTAGES
RECEIVING A SCORE GREATER THAN OR EQUAL TO 2

[Plot not reproduced. Vertical axis: difference in percentages, from -20 to 10. Legend: ▲ SPLIT, + DALI, O HOLE.]



Chapter  11.3 
ESTIMATION OF TRENDS IN WRITING ACHIEVEMENT

Eugene  G.  Johnson 
Educational  Testing  Service 


Chapter  11.2  noted  that  there  appears  to  be  a  differential  effect  of 
mode  of  administration  on  the  estimation  of  writing  achievement.  In 
particular,  writing  exercises  administered  using  the  paced  tape  procedures, 
where  the  instructions  are  read  aloud  to  the  students,  tend  to  be  less 
difficult  for  students  than  the  BIB  spiralled  administrations,  where  the 
students  are  required  to  read  and  understand  the  instructions. 
Furthermore,  the  reading  of  the  writing  assignment  in  a  paced  tape 
administration  appears  to  be  of  more  benefit  for  some  subgroups  of  students 
than  for  others,  where  the  amount  of  benefit  depends  on  the  item.  This 
differential benefit makes the adjustment of scores from BIB spiralled
administration  to  correspond  to  scores  from  a  paced  tape  administration 
difficult,  since  a  different  adjustment  may  be  required  for  each  subgroup. 
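
As a rough illustration of how the subgroup differences reported in Chapter 11.2 can be examined, the sketch below recomputes the BIB-minus-paced difference in mean primary trait score for three Age 17 "Split Session" subgroups, using the means and standard errors printed in Table 11.2(8). It assumes the two administrations are independent samples, so that the standard error of a difference is the root sum of squares of the two standard errors. The function name and data layout are hypothetical and are not part of the NAEP analysis system. Results agree with the tabled differences only up to rounding of the published values.

```python
import math

def difference_with_se(bib, paced):
    """Return (difference, standard error) for BIB minus paced.

    Each argument is a (mean, standard_error) pair. Assumes the two
    samples are independent, so the variances of the means add.
    """
    diff = bib[0] - paced[0]
    se = math.sqrt(bib[1] ** 2 + paced[1] ** 2)
    return diff, se

# Age 17 "Split Session" means (standard errors) from Table 11.2(8).
subgroups = {
    "TOTAL": ((1.71, 0.01), (1.82, 0.04)),
    "White": ((1.77, 0.02), (1.87, 0.05)),
    "Black": ((1.48, 0.04), (1.68, 0.05)),
}

for name, (bib, paced) in subgroups.items():
    diff, se = difference_with_se(bib, paced)
    print(f"{name:6s} diff = {diff:+.2f}  se = {se:.3f}  ratio = {diff / se:+.2f}")
```

The ratio of the difference to its standard error varies from subgroup to subgroup, which is the differential effect described above. The critical value behind the asterisks in the tables is not stated in this section, so the sketch stops short of declaring significance.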

Most  of  the  Year  15  writing  assessment  employed  BIB  spiralled 
administration  of  writing  exercises,  in  contrast  with  previous  assessments 
which  used  only  paced  tape  procedures.    Consequently,  measurements  of 
trends  in  writing  achievement  over  time,  using  the  results  from  the  BIB 
spiralled  assessment  (possibly  adjusted  for  the  effect  of  mode  of 
administration),  will  be  confounded  by  the  effects  of  the  different  mode  of 
administration  in  Year  15  as  opposed  to  the  previous  assessments.  The 
degree  of  this  confounding  depends  on  the  subgroup  considered  and  the 
success  of  the  adjustment. 

To  eliminate  the  confounded  effects  of  mode  of  administration  on  the 
estimates  of  writing  achievement,  the  statistics  used  to  report  trends  over 
time  are  not  based  on  the  full  Year  15  NAEP  writing  assessment,  but  are 
limited  to  the  data  obtained  from  the  subset  of  writing  tasks  at  each  age 
that  were  included  in  the  booklets  administered  in  accordance  with  the 
paced tape procedure, in exactly the same manner as in past writing
assessments.

Although the need for overlapping procedures and analyses designed to
link the two methods had been anticipated by NAEP staff, only about half of
the previously administered writing items (and therefore only about
one fifth of all the writing items included at each age/grade level in the
full Year 15 writing assessment) were selected for dual assessment,
appearing in both the primary BIB spiralled assessment and in the much
smaller paced tape assessment. The trend results presented in the report
Writing: Trends Across the Decade, 1974-1984 (Applebee, Langer, & Mullis,
1986a) are based upon this limited selection of writing items administered
at each age, and generalizations based on the results should be viewed with
caution, particularly when they pertain to one type of writing at one age
level.

These writing items span different periods in NAEP's history. One of
the  items  was  included  in  both  the  Year  5  and  the  Year  10  writing 
assessments  and  two  of  these  were  included  in  the  previous  assessments  as 
well  as  in  the  Year  15  assessment,  thereby  enabling  comparisons  in  student 
performance  to  be  made  across  ten  years  (Year  5  to  Year  10  to  Year  15)  or 
across  five  years  (Year  10  to  Year  15). 

To provide a fuller perspective on trends in writing proficiency during
the last ten years, we have reported the newly analyzed trend information
in the context of the trend data for those items collected during the
earlier five-year time span (Year 5 to Year 10) and reported by the
Education Commission of the States (1980). The complete set of trend
results is based only on comparisons of identical writing tasks
administered in the same way in at least two assessments. All responses to
each task from all assessment administrations were evaluated at the same
time by the same readers.

The data linking back to the first writing assessment (Year 1) were
minimal: one single national sample (about 2,500 papers) on one
imaginative writing task rated using the primary trait method at each age
level, and one national subsample (about 400 papers) on a different task at
each age level rated holistically. Given these limited data and the fact
that any subgroup trends from Year 1 to Year 5 would be based on only one
imaginative writing task, the writing report is limited to trends over the
last decade based on changes between the Year 5, Year 10 and Year 15
assessments. The full set of writing items used is summarized in Table
11.3(1).




Table  11.3(1) 

Exercises  Used  to  Estimate  Trends  in  Writing  Performance 


                                             Scoring
Exercise                                     Method¹

INFORMATIVE
  Dali (description)                         P, H
  Electric Blanket (business letter)         P
  Describe (description)                     H

PERSUASIVE
  Aunt May² (letter)                         P, H
  Split Session² (letter)                    P, H
  Puppy Letter (letter)                      P
  Principal (letter)                         P
  Recreation Center (written speech)         P

IMAGINATIVE
  Hole in the Box² (description)             P, H
  Goldfish (description)                     P
  Loss (description)                         P
  Fireflies (narration)                      P
  Kangaroo (narration)                       H
  Rainy Day (narration)                      P
  Stork (narration)                          P
  Grape Peeler (satire-humor)                P

BACKGROUND QUESTIONS


-1974  

13  17 


-  2276 
420  417 


Number  in  Sample 

 1979  

9         13  17 


2482  2496 


536 


2433 
2781 
539 


 1984  

9        13  17 

1351    1275  1539 


2643 


2525 


2494 


1386 


2735  2742 


1276  1540 


2552 


2793 


2308 


2784 


2543    2513    2246    2464    2782    2688    1344    1289  1534 


2611 

2573 
409 


2607 


2621 


2281 
2283 
2237 


2475 

-  2775 
2553 

494 

-  2804 

-  2748 

-  2765 

-  29430  26631 


5158  6209 


¹P = Primary Trait, H = Holistic

²Analysis performed by ETS in conjunction with analysis of the Year 15 writing assessment results




Chapter 11.4
THE AVERAGE RESPONSE METHOD (ARM) OF SCALING¹


Albert E. Beaton
Eugene  G.  Johnson 

Educational  Testing  Service 


The  National  Assessment  of  Educational  Progress  (NAEP)  used  a  variant 
of multiple matrix sampling called BIB spiralling (Beaton, 1984) in its
Year  15  assessment.  Multiple  matrix  sampling  allows  the  assessor  to 
administer  a  large  number  of  exercises  in  a  subject  area,  more  exercises 
than  would  be  prudent  to  ask  any  individual  student  to  perform.  BIB 
spiralling  has  the  additional  property  of  assuring  that  each  pair  of 
exercises  is  administered  to  a  randomly  equivalent  subsample  of  students. 
The  BIB  spiralling  was  imposed  on  an  already  complex  multi-staged  sample 
design.  In  sum,  the  NAEP  of  1983-84  contained  many  reading  and  writing 
exercises,  as  well  as  hundreds  of  questions  about  backgrounds,  attitudes, 
and  activities,  which  were  collected  on  a  sample  of  over  100,000  students 
in  this  nation's  schools. 
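
The pairing property of a balanced incomplete block (BIB) design can be checked mechanically. The sketch below uses a small textbook design with 7 blocks of exercises arranged into 7 booklets of 3 blocks each. It is purely illustrative and is not the actual Year 15 booklet layout, which is not reproduced in this chapter; the names `BOOKLETS` and `pair_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# A (7, 3, 1) balanced incomplete block design: 7 item blocks, 7 booklets,
# 3 blocks per booklet. Illustrative only; this is not the NAEP booklet map.
BOOKLETS = [
    (1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 7),
    (5, 6, 1), (6, 7, 2), (7, 1, 3),
]

def pair_counts(booklets):
    """Count how many booklets contain each unordered pair of blocks."""
    counts = Counter()
    for booklet in booklets:
        for pair in combinations(sorted(booklet), 2):
            counts[pair] += 1
    return counts

counts = pair_counts(BOOKLETS)
# In a balanced design every pair of blocks appears together equally often.
print(len(counts), set(counts.values()))
```

Because every pair of blocks shares a booklet the same number of times, the students who received any given pair form comparable subsamples, which is what makes all pairwise statistics estimable.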

The  results  of  an  assessment  like  this  would  be  hard  to  integrate  and 
interpret  if  the  vast  array  of  information  were  presented  in  an 
exercise-by-exercise  manner.  NAEP  has  elected  to  summarize  the  available 
information  by  developing  scales  which  encapsulate  much  of  the  information 
available  in  the  exercise  responses.  Separate  scales  have  been  developed 
for  the  reading  and  writing  exercises.  It  is  the  purpose  of  this  chapter  to 
describe  the  rationale  and  properties  of  the  writing  scale. 

The properties of the reading scale have been reported in Chapter 10.5.
The technology of the reading scale was not appropriate for the writing
scale. For reading, there were a large number of exercises, of which 228
were used in the scale, and the individual exercises could be scored as
right or wrong, so item response theoretic (IRT) methodology could be
adapted for the scale. For writing, there were only 22 exercises, of which
only ten were useful for the writing scale, and the individual exercises
were graded on a zero to four scale, so standard IRT methodology was not
appropriate. Several attempts have been made to adapt IRT technology to
these non-binary writing exercise responses, but these efforts have not
proved fruitful at this time.


¹The statistical programming for this section was provided by Bruce
Kaplan, David Freund, and Laurel Barnett. The figures were produced by Ira
Sample.




Both the reading and writing portions of the assessment do have
important features in common. Perhaps most important here is that the
information available about most students is sparse so that the scale
scores for few, if any, students are sufficiently accurate for individual
decision making. A teacher or administrator would insist on a more reliable
test, that is, a test with many more items in the subject area, before
using the test for making decisions which would affect a student's academic
career. However, a national assessment does not report individual scores
and is concerned primarily with producing national and regional
parameter estimates and measures of the accuracy of estimation. The
unreliability of individual scores has led Mislevy (1985a) to expand
Rubin's (1978) work on missing data to assessments, and this work has been
incorporated into both the reading and writing scale construction and
analysis.

The  writing  scale  is  defined  for  NAEP  as  the  average  of  a  subject's 
scores  on  ten  specific  essays  that  were  administered  to  NAEP  subjects. 
These  ten  essays  were  chosen  because  they  were  administered  at  more  than 
one  age  level  and  because  all  inter-correlations  among  them  were  estimable. 
The  (unobserved)  writing  scale  score  is  a  latent  variable,  since  no 
individual  student  actually  responded  to  more  than  four  essays  and  thus  the 
average  over  all  ten  essays  must  be  estimated.  The  result  of  the  scaling 
process  is  a  set  of  plausible  values  for  each  student  who  responded  to  at 
least  one  of  the  ten  essays;  each  plausible  value  is  a  different  estimate 
of  the  student's  unknown  writing  scale  score.  The  five  different  estimates 
for each student are values from the conditional distribution of potential
scores  for  the  student,  conditional  on  the  available  information,  and 
reflect  the  uncertainty  in  estimation. 

The  writing  scale  described  here  is  closely  related  to  an  estimation 
procedure  suggested  by  Goldstein  and  James  (1983).  Goldstein  and  James 
address  the  estimation  of  population  averages  of  test  scores  where  the 
scores  are  the  sum  of  item  responses,  and  such  estimation  is  the  primary 
concern of this scaling method as well. To improve the estimates, the NAEP
scaling procedure uses other available information in the estimation
process. The writing scale also results in the plausible values which may
be  considered  as  partial  computations  that  can  be  used  for  estimating  other 
parameters.  Also,  the  partial  computations  are  useful  in  estimation  with  a 
complex  sample,  such  as  NAEP's.  Proper  use  of  the  plausible  values  allows 
for  an  accounting  of  the  uncertainty  due  to  incomplete  information  both  due 
to  the  sampling  of  individuals  from  the  population  and  due  to  the 
incomplete  information  on  each  sampled  individual.  However,  the  plausible 
values  may  result  in  biased  estimates  of  parameters  that  were  not  included 
in  the  scaling  process  (see  Sections  11.4.3  and  11.4.6). 

The  next  two  sections  of  this  chapter  will  develop  the  scaling  method. 
The following section will discuss the properties of the plausible values
of  the  scale  score.  The  final  sections  will  discuss  the  specifics  of  the 
application  of  this  technique  to  NAEP  writing  data. 




11.4.1  Method 


As mentioned above, the writing scale score is the average of a set of
writing exercises. Let us assume that we wish to estimate the average
writing score for some group, say, males. To be more general, we will
assume that we wish to estimate a set of parameters called $\beta$. $\beta$
may be the mean of any subgroup, a set of means, or any arbitrary parameters
that may contribute to or be related to performance in writing. If we can
estimate $\beta$, then we can estimate any linear combinations of $\beta$.
Let us be explicit about the notation and assumptions. Let

$Z$   be an $N \times p$ matrix of rank $p$ for the writing scores
      of the $N$ ($i = 1, 2, \ldots, N$) subjects on the $p$
      writing essays. The values $z_{ik}$ ($k = 1, 2, \ldots, p$)
      will be known for those who were administered the
      $k$th exercise and unknown otherwise.

$a$   be a $p$-element column vector of known constants.
      Although $a$ may contain any values, we will
      generally assume here that all values $a_k = 1/p$.

$X$   be an $N \times (m+1)$ matrix containing the values of the
      $m$ conditioning variables for the $N$ subjects.
      The values $x_{ij}$ ($j = 0, 1, 2, \ldots, m$) of the
      conditioning variables are assumed known for all
      subjects. The zeroth column of $X$ is a vector of
      unities. For convenience, we will use $m' = m + 1$. For
      simplicity here, we will assume that $X$ is of rank
      $m'$.

The Nth-order column vector $Y$ is defined as

$$Y = Za$$

where the elements of $Y$, $y_i$, are the writing scale scores. The exact
value of $y_i$ will not be known unless subject $i$ was administered all
writing tasks.

We will assume that we have identified the complete set of conditioning
variables, $X$, and that the effect of the conditioning variables is to move
the centers of the distribution while leaving the spread alone so that, to
a reasonable degree of approximation,

$$Z = XB + E$$

where $B$ is an $m' \times p$ matrix of unknown constants and $E$ is an
$N \times p$ matrix of unknown errors. Also, we assume that each row of $E$
is independently and identically distributed as $N(0, \Sigma)$.

Consequently,

$$Y = XBa + Ea = X\beta + \epsilon$$

where $\beta = Ba$ and $\epsilon = Ea$. It follows that

$$\epsilon \sim N(0, \sigma^2) \quad \text{where} \quad \sigma^2 = a'\Sigma a.$$
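
The definition $Y = Za$ can be illustrated with a small sketch. It assumes equal weights $a_k = 1/p$ and marks essays that were not administered with `None`; the function name and the toy scores are hypothetical. A score is returned only when every essay was administered, which mirrors the statement that $y_i$ is otherwise unknown and has to be estimated.

```python
from typing import Optional, Sequence

def scale_score(z_row: Sequence[Optional[float]],
                weights: Optional[Sequence[float]] = None) -> Optional[float]:
    """Return y_i = sum_k a_k * z_ik, or None if any z_ik is missing."""
    p = len(z_row)
    if weights is None:
        weights = [1.0 / p] * p          # a_k = 1/p, the case assumed in the text
    if any(z is None for z in z_row):
        return None                      # y_i is unobserved for this subject
    return sum(a * z for a, z in zip(weights, z_row))

# Toy example with p = 4 essays scored 0 to 4 (illustrative values only).
print(scale_score([2, 3, 1, 2]))         # complete record
print(scale_score([2, None, 1, None]))   # incomplete record
```

Under BIB spiralling no student has a complete record, so in practice every $y_i$ falls into the second case.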

Although  some  of  the  values  of  Z  are  unknown,  the  BIB  spiralling 
procedure  produces  sufficient  information  to  estimate  the  mean  and  standard 
deviation  of  each  writing  score  and  also  the  correlation  between  each  pair 
of  scores.  Furthermore,  because  the  BIB  spiralling  procedure  presents  items 
and  pairs  of  items  to  randomly  equivalent  (i.e.,  representative — see 
Chapter  5)  subsamples,  estimates  of  means,  variances,  and  covariances, 
based on the total set of available responses, are unbiased for the
population  values. 

A maximum likelihood estimate of the cross-products matrix can be
computed using the EM algorithm of Dempster, Laird, and Rubin (1977). After
forming the matrix $V = [X|Z]$, let

$C$ = the maximum likelihood estimate of $V'V$

where the cross-product matrix $C$ has the expected value (E)

$$E(C) = E\begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix}
       = \begin{bmatrix} X'X & X'XB \\ B'X'X & B'X'XB + N\Sigma \end{bmatrix}$$

Using $C$, the mean and variance of the scale score $y$ can be estimated as
can its correlations with the variables in $V$. Consider a transformation
matrix

$$T_y = \begin{bmatrix} I_{m'} & 0 & 0 \\ 0 & I_p & a \end{bmatrix}$$

where $I_{m'}$ and $I_p$ are appropriately sized identity matrices. If $V$
were completely known, then the $N$ by $m'+p+1$ matrix

$$V_y = VT_y = [X|Z|Y]$$

would contain all of the elements of $V$ as well as a column containing the
scores $y_i$. Using $C$, the estimate of $V_y'V_y$ is

$$C_y = T_y'CT_y = \begin{bmatrix} X'X & X'Z & X'Y \\ Z'X & Z'Z & Z'Y \\ Y'X & Y'Z & Y'Y \end{bmatrix}$$

where $X'Y$, $Z'Y$, and $Y'Y$ are maximum likelihood estimates of the sums of
squares and cross-products of $Y$ with the other variables. $C_y$ has the
expected value

$$E(C_y) = \begin{bmatrix}
X'X & X'XB & X'X\beta \\
B'X'X & B'X'XB + N\Sigma & B'X'X\beta \\
\beta'X'X & \beta'X'XB & \beta'X'X\beta + N\sigma^2
\end{bmatrix}$$

The matrix $C_y$ can be used to estimate a missing value $y_i$. Let $z_i$,
the $i$th row of $Z$, be partitioned

$$z_i = [z_{1i} | z_{2i}]$$

where $z_{1i}$ is a $p_{1i}$-order vector containing the known values of
$z_i$, $z_{2i}$ is a $p_{2i}$-order vector containing the unknown values, and
$p = p_{1i} + p_{2i}$. The known information of subject $i$ can be encoded in
the vector

$$v_{1i} = [x_i | z_{1i}]$$

where $x_i$ is the $i$th row of $X$. Let $Z_1$ be an $N \times p_1$ matrix of
the vectors $z_{1i}$, $V_1$ be the matrix $[X|Z_1]$, and

$$C_{1y} = \begin{bmatrix} V_1'V_1 & V_1'Y \\ Y'V_1 & Y'Y \end{bmatrix}
         = \begin{bmatrix} X'X & X'Z_1 & X'Y \\ Z_1'X & Z_1'Z_1 & Z_1'Y \\ Y'X & Y'Z_1 & Y'Y \end{bmatrix}$$

be the rows and columns of $C_y$ corresponding to the columns in $V_1$ and
$Y$. Then, the regression equation for estimating $y$ from $V_1$ can be
computed by sweeping (Beaton, 1964) the rows and columns corresponding to
$V_1$ with the result that

$$C^*_{1y} = \begin{bmatrix} (V_1'V_1)^{-1} & d \\ -d' & c^*_{yy} \end{bmatrix}$$

where $d$ is a $m' + p_1$ order column vector. The elements of $d$ may also
be written as two subvectors, $d' = [c_x' | c_z']$ where

$$c_x = [X'(I - K_z)X]^{-1} X'(I - K_z)Y$$

$$c_z = [Z_1'(I - K_x)Z_1]^{-1} Z_1'(I - K_x)Y$$

using $K_x = X(X'X)^{-1}X'$

and $K_z = Z_1(Z_1'Z_1)^{-1}Z_1'$.
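
A minimal sketch of the sweep computation follows. It uses one common sign convention for the sweep operator (pivot row divided by the pivot, pivot column divided by the pivot and negated), which is an assumption here rather than a transcription of Beaton (1964). The toy cross-products matrix stands in for the rows and columns of the cross-products matrix belonging to the predictors and to $Y$; the function name and the numbers are hypothetical.

```python
def sweep(matrix, k):
    """Sweep a square matrix (list of lists) on pivot k; returns a new matrix."""
    n = len(matrix)
    pivot = matrix[k][k]
    if pivot == 0:
        raise ZeroDivisionError("cannot sweep on a zero pivot")
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                out[i][j] = 1.0 / pivot
            elif i == k:
                out[i][j] = matrix[k][j] / pivot       # pivot row
            elif j == k:
                out[i][j] = -matrix[i][k] / pivot      # pivot column
            else:
                out[i][j] = matrix[i][j] - matrix[i][k] * matrix[k][j] / pivot
    return out

# Toy data: intercept, one predictor x = 0, 1, 2, 3, and y = 1, 3, 4, 8.
# Cross-products of [1 | x | y]; illustrative numbers only.
cross_products = [
    [4.0, 6.0, 16.0],
    [6.0, 14.0, 35.0],
    [16.0, 35.0, 90.0],
]

swept = cross_products
for pivot_index in (0, 1):        # sweep the predictor rows and columns only
    swept = sweep(swept, pivot_index)

d = [swept[0][2], swept[1][2]]    # regression coefficients
residual_ss = swept[2][2]         # residual sum of squares
print(d, residual_ss)
```

After both predictor pivots are swept, the upper left block holds the inverse of the predictor cross-products, the last column holds the coefficients, and the lower right element holds the residual sum of squares. A different pattern of administered essays only changes which rows and columns are swept.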

Using $d$ and the vector $v_{1i}$, it is possible to estimate the value of
$y_i$ for subject $i$ as

$$\hat{y}_i = v_{1i} d.$$

Assuming a correctly specified model, the expected average value of
$\hat{y}$ is the same as the expected average value of $y$ but its variance
is different, being

$$\mathrm{var}(\hat{y}) = R^2 \sigma_y^2$$

where $R$ is the multiple correlation of $y$ on $X$ and $Z_1$ and
$\sigma_y^2$ is the variance of the $y_i$ about their mean. Thus
$\mathrm{var}(\hat{y})$ is less than $\sigma_y^2$ unless $R^2 = 1$, which
would indicate that the values of $y$ were perfectly predictable from the
known information. What has not been accounted for in the use of $\hat{y}$
for the prediction of $y$ is the fact that there is a distribution of
potential scores for any individual and that the estimate $\hat{y}_i$ is,
under normality assumptions, the estimated mean of the conditional
distribution of the scores $y_i$ given the known information $X$ and $Z_1$.
As such, $\hat{y}_i$ makes no allowance for the variability of the potential
scores of an individual about the conditional mean.

This source of variability can be accounted for by estimating the
variability of the residuals from the predicted values $\hat{y}_i$. (There
is also variability in the prediction of $\hat{y}_i$ which will be addressed
in the next section.) An estimate of the variance of the residuals about
$\hat{y}_i$ is available in the term $c^*_{yy}$ which is

$$c^*_{yy} = (Y - V_1 d)'(Y - V_1 d),$$

the residual sum of squares.

A plausible value of $y$, $\tilde{y}$ say, which is a realization from a
distribution with the same first two moments as the distribution of $y$, can
be formed by adding a random normal deviate $e$ to $\hat{y}$ where $e$ is
normally distributed with mean 0 and variance equal to the residual mean
square $c^*_{yy}/(N - m' - p_1)$. A plausible value for the respondent is
then

$$\tilde{y}_i = \hat{y}_i + e_i = x_i c_x + z_{1i} c_z + e_i.$$
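
The construction of a plausible value can be sketched as follows. The function and argument names are hypothetical, the inputs are made-up numbers, and Python's `random.gauss` stands in for whatever normal generator was actually used. Five draws are taken by default because five estimates per student are mentioned earlier in this chapter.

```python
import math
import random

def plausible_values(y_hat, residual_ss, n_subjects, n_predictors,
                     n_draws=5, rng=None):
    """Draw plausible values y_hat + e, with e ~ N(0, residual mean square).

    residual_ss   : the residual sum of squares from the swept matrix
    n_subjects    : N
    n_predictors  : m' + p_1, the number of columns used as predictors
    """
    rng = rng or random.Random()
    residual_ms = residual_ss / (n_subjects - n_predictors)
    sd = math.sqrt(residual_ms)
    return [y_hat + rng.gauss(0.0, sd) for _ in range(n_draws)]

# Illustrative numbers only: predicted score 1.9, residual sum of squares
# 450 on N = 1000 subjects with m' + p_1 = 12 predictors.
draws = plausible_values(1.9, 450.0, 1000, 12, rng=random.Random(1984))
print([round(v, 2) for v in draws])
```

Each draw is a different estimate of the same unobserved score, and the spread among the draws carries the uncertainty about that score into later analyses.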




Because of the BIB spiralling, different subjects will be missing different
exercises, and a different least squares equation will be needed for each
pattern of missing data. Under the assumption that the complete set of
conditioning variables has been identified and that the variability is
correctly modeled, the distribution of the plausible values will, on the
average, match that of the $y$ values, so that each $\tilde{y}$ value will
have the same expected mean and variance as the corresponding $y$.

11.4.2 Using Plausible Values

Computing plausible values provides an estimated $\tilde{y}$ for each
subject even when component parts are missing. Each [illegible]
assumptions, and is useful in estimating the values of $\beta$ or linear
functions of $\beta$. However, using the values $\tilde{y}$ in least squares
analyses as if they were exact values of $y$ has some limitations. If the
$\tilde{y}_i$ are used to fit a model of the form

$$\tilde{Y} = Xb + e$$

where $\tilde{Y}$ is the vector of plausible values $\tilde{y}_i$, the
matrix $X$ is the same as above, $b$ is a regression coefficient vector, and
$e$ is an error vector, the least squares estimate

$$\hat{b} = (X'X)^{-1}X'\tilde{Y}$$

is an unbiased estimate of $\beta$ since

$$E(\hat{b}) = (X'X)^{-1}X'E(\tilde{Y}) = \beta$$

(the proof is in the next section). Thus, $\tilde{Y}$ may be used to
estimate functions of $\beta$ such as group differences if group membership
was coded in $X$ and thus used in the creation of the plausible values.

Computing the usual estimate of the error in regression coefficients

$$\mathrm{Var}(\hat{b}) = s^2 (X'X)^{-1}$$

based on a single set of plausible values would result in an inaccurate
accounting of the uncertainty involved in their estimation because the
uncertainty in the measurement of the individual $y$ values has not been
completely accounted for. To account for this, Rubin (1978) has suggested
assessing the uncertainty by repeating an analysis several times, each time
using a different set of plausible values.

Given the model, there are two sources of uncertainty reflected in a
set of plausible values. The first is the uncertainty in a student's score
and is measured by the variability about the conditional mean score. This
is addressed by the error term in $\tilde{y}_i$. The other source of
uncertainty is the use of the regression equation computed from the matrix
$C_y$. If the sample size is very large, the error introduced by using the
sample regression equation as opposed to the population equation can be
considered trivial and ignored. If the sampling variances and covariances
are not small enough to ignore, then this uncertainty can be incorporated
into the procedure by randomly selecting a value of $d$ from a distribution
of plausible $d$ values. From least squares theory, under the above
assumptions,

$$d \sim N(\gamma,\ \sigma^2 (V_1'V_1)^{-1})$$

where $\gamma$ is unknown, $\sigma^2$ can be estimated from the data, and
$V_1'V_1$ is a matrix containing known values in $X$ and $Z_1$. The variance
of the $d$ can be expressed by a triangular matrix $T_1$ such that

$$T_1 T_1' = \sigma^2 (V_1'V_1)^{-1}.$$

Letting $r$ be a $m' + p_1$ vector of random normal (0,1) numbers, then

$$\Delta = T_1 r$$

where the vector $\Delta$ is distributed $N(0, \sigma^2 (V_1'V_1)^{-1})$. To
incorporate the uncertainty into the model, the vector $\Delta$ can be added
to the best available estimate of $\gamma$ which is $d$.
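
The draw of a plausible coefficient vector can be sketched with a Cholesky factor. The code below is an illustration under stated assumptions: it takes the estimated covariance matrix of the coefficients as given, factors it into a lower triangular matrix times its transpose, and adds the product of that factor and a vector of independent standard normal deviates to the fitted coefficients. The helper names and the 2 x 2 numbers are hypothetical.

```python
import math
import random

def cholesky_lower(a):
    """Return lower-triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def draw_coefficients(d, cov, rng):
    """Return d + T r, where T T' = cov and r holds N(0, 1) deviates."""
    t = cholesky_lower(cov)
    r = [rng.gauss(0.0, 1.0) for _ in d]
    delta = [sum(t[i][k] * r[k] for k in range(i + 1)) for i in range(len(d))]
    return [di + dl for di, dl in zip(d, delta)]

# Illustrative values: fitted coefficients and their estimated covariance
# (a residual mean square of 0.9 times an inverse cross-products matrix).
d = [0.7, 2.2]
cov = [[0.63, -0.27], [-0.27, 0.18]]
print(draw_coefficients(d, cov, random.Random(15)))
```

One such draw would be made per set of plausible values, so that the same perturbed equation is applied to every student within a set.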

Rubin's recommendation, as applied here, is to generate several sets of
plausible values, forming several similar data sets, and then estimating
parameters using each set separately. If the uncertainty due to estimation
of $C_y$ is to be included, this uncertainty should be addressed by
computing a vector $\Delta$ for each of the sets of plausible values and
using the same vector for each plausible value within the set. Rubin shows
that the average of the several sets of parameter estimates is an unbiased
estimate of the parameter and that the variance of the parameter estimates
is a component of uncertainty which should be added to the uncertainty due
to sampling.
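
The combination of results across sets of plausible values can be sketched as below. The sketch applies the combining rule usually attributed to Rubin for multiple imputation, in which the variance among the M estimates is inflated by (1 + 1/M) before being added to the average sampling variance. Whether the NAEP analyses used exactly this factor is an assumption here, and the function name and numbers are hypothetical.

```python
from statistics import mean, variance

def combine_plausible_value_results(estimates, sampling_variances):
    """Combine M analyses, one per set of plausible values.

    estimates          : the M parameter estimates
    sampling_variances : the M sampling variances of those estimates
    Returns (combined estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need matching lists with at least two analyses")
    combined = mean(estimates)
    within = mean(sampling_variances)     # uncertainty due to sampling
    between = variance(estimates)         # uncertainty due to unobserved scores
    total = within + (1.0 + 1.0 / m) * between
    return combined, total

# Five illustrative estimates of a subgroup mean and their sampling variances.
est, var = combine_plausible_value_results(
    [1.80, 1.83, 1.79, 1.82, 1.81],
    [0.0009, 0.0010, 0.0009, 0.0011, 0.0010],
)
print(round(est, 3), round(var, 6))
```

The sampling variances themselves would come from a procedure suited to the complex sample design; they are simply taken as inputs in this sketch.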




11.4.3 The Bias of the Average Response Method Estimator
Due to Model Mis-specification

The preceding estimation technique produces plausible values $\tilde{y}$
whose first two moments match, on the average, those of the true
(unobserved) values $y$, provided that the model $Y = X\beta + \epsilon$ is
an adequate description. This section establishes that fact and
investigates the properties of the estimators $\hat{y}$ and $\tilde{y}$,
computed as above, when the model is inadequate.

Suppose that, rather than the model presented in Section 11.4.2, a more
adequate specification is

$$ Z = XB + U\Gamma + E $$

where Z, X and B are as before, U is an N x q matrix of (potentially) known
constants, $\Gamma$ is a q x p matrix of unknown parameters, and E is an
N x p matrix of errors, each row independently distributed as multivariate
normal with means 0 and variance matrix $\Sigma$.

Then, with Y = Za as before, the model for Y is

$$ Y = X\beta + U\gamma + \epsilon $$

where $\beta = Ba$, $\gamma = \Gamma a$ and $\epsilon = Ea$ is multivariate
normal with zero mean and variance matrix $\sigma^2 I$ with
$\sigma^2 = a'\Sigma a$.
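
The step from the item-level model to the model for Y can be written out
explicitly. The following short derivation uses only the definitions given
above; the row notation $e_i$ for the i-th row of E is introduced here
purely for illustration.

```latex
\begin{aligned}
Y = Za &= (XB + U\Gamma + E)\,a \\
       &= X(Ba) + U(\Gamma a) + Ea \;=\; X\beta + U\gamma + \epsilon, \\
\operatorname{Var}(\epsilon_i) &= \operatorname{Var}(e_i a) = a'\Sigma a = \sigma^2 .
\end{aligned}
```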

As before, let $z_i$ be the i-th row of Z, corresponding to respondent i,
where $z_i$ is partitioned as $[z_{1i} \;\; z_{2i}]$ with $z_{1i}$ a
1 x $p_1$ vector containing the known values and $z_{2i}$ a 1 x $p_2$
vector containing the unknown values of $z_i$.

Consider the estimation of a value of $y_i$ based only on $z_{1i}$ and
$x_i$, the i-th row of X, and ignoring the additional information in $u_i$,
the i-th row of U.

Proceeding as in Section 11.4.2, form the cross-product matrix

$$ \begin{bmatrix} V_1'V_1 & V_1'Y \\ Y'V_1 & Y'Y \end{bmatrix} $$

where $V_1$ is $[X \mid Z_1]$. From this obtain the estimated value for
respondent i as

$$ \hat{y}_i = [x_i \;\; z_{1i}]\,(V_1'V_1)^{-1}V_1'Y . $$

For present purposes, a more convenient (but equivalent) representation
of $\hat{y}_i$ is

$$ \hat{y}_i = x_i (X'X)^{-1}X'Y
   + z_{1i.x}\,(Z_{1.x}'Z_{1.x})^{-1}Z_{1.x}'Y_{.x} $$

where

$$ Z_{1.x} = (I - X(X'X)^{-1}X')Z_1 = (I - K_x)Z_1 $$

is $Z_1$ linearly adjusted by X,

$$ Y_{.x} = (I - K_x)Y $$

is Y linearly adjusted by X, and $z_{1i.x}$ is the i-th row of $Z_{1.x}$.


Let $\Gamma_1$ and $E_1$ be the columns of $\Gamma$ and E corresponding to
$Z_1$ so that

$$ Z_{1.x} = U_{.x}\Gamma_1 + (I - K_x)E_1 $$

where $U_{.x} = (I - K_x)U$ is orthogonal to X.

Similarly,

$$ Y_{.x} = U_{.x}\gamma + (I - K_x)\epsilon $$

consists of the part of the vector Y which is orthogonal to the column
space of X.

Write $\beta^* = (X'X)^{-1}X'Y$ and
$a_1^* = (Z_{1.x}'Z_{1.x})^{-1}Z_{1.x}'Y_{.x}$ so that

$$ \hat{y}_i = x_i\beta^* + z_{1i.x}\,a_1^* $$

is the sum of two terms, each of which can be thought of as a predicted
value.

The first term, $x_i\beta^*$, is the least-squares predictor of Y from X
and has the expectation

$$ x_i(X'X)^{-1}X'(X\beta + U\gamma) = x_i\beta + x_i(X'X)^{-1}X'U\gamma $$

where $x_i(X'X)^{-1}X'U$ is the projection of $u_i$ on the column space of
X. This prediction thus accounts for all information in Y predictable from
linear combinations of the conditioning variables X but ignores any
information in the part of the subspace of U which is orthogonal to X, that
is, in $U_{.x}$.

This information is addressed by the second predictor $z_{1i.x}\,a_1^*$,
where $z_{1i.x}\,a_1^*$ is the minimum mean squared error linear predictor
of $y_{i.x}$ from $z_{1i.x}$ alone. Since $a_1^*$ is the minimum mean
squared error estimator, it is in that sense optimal. Unfortunately, the
estimator is also biased, as shall be seen.

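Observe that for $\hat{y}_i$ to be an unbiased estimator of $y_i$ for every
i, it is required that

$$ E(Z_{1.x}\,a_1^*) = U_{.x}\gamma . $$

Now, since $Z_{1.x}$ and $Y_{.x}$ are jointly normally distributed, the
conditional expectation of $Y_{.x}$, given $Z_{1.x}$, is

$$ E(Y_{.x} \mid Z_{1.x}) = U_{.x}(\gamma - \Gamma_1 D_{1y}) + Z_{1.x}D_{1y} $$

where

$$ D_{1y} = \Sigma_{11}^{-1}\sigma_{1y}, $$

$\Sigma_{11} = \mathrm{Var}(z_{1i})$ and
$\sigma_{1y} = \mathrm{Cov}(z_{1i}, y_i)$.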


Consequently, the conditional (on $Z_{1.x}$) expectation of
$Z_{1.x}\,a_1^*$ is

$$ Z_{1.x}(Z_{1.x}'Z_{1.x})^{-1}Z_{1.x}'U_{.x}(\gamma - \Gamma_1 D_{1y})
   + Z_{1.x}D_{1y} . $$

Assuming that the sample size N is large and replacing $Z_{1.x}'Z_{1.x}$
and $Z_{1.x}'U_{.x}$ with their expectations produces

$$ E(Z_{1.x}\,a_1^* \mid Z_{1.x}) =
   Z_{1.x}(\Sigma_{11} + \Gamma_1'\Psi\Gamma_1)^{-1}\Gamma_1'\Psi
   (\gamma - \Gamma_1 D_{1y}) + Z_{1.x}D_{1y} $$

where

$$ \Psi = (U_{.x}'U_{.x})/(N - m) . $$

Thus, unconditionally,

$$ E(Z_{1.x}\,a_1^*) = U_{.x}\gamma + \mathrm{BIAS} $$

where

$$ \mathrm{BIAS} = U_{.x}\left(\Gamma_1(\Sigma_{11}
   + \Gamma_1'\Psi\Gamma_1)^{-1}\Gamma_1'\Psi - I\right)
   (\gamma - \Gamma_1\Sigma_{11}^{-1}\sigma_{1y}) $$

and so $\hat{y}_i$ is an unbiased estimator of $y_i$ only when BIAS = 0.

The bias of $\hat{y}_i$ will be zero when any of the following three
conditions obtains:

(1)  $U_{.x} = 0$ so that U is contained in the column space of X (and
     thus all information in U is contained in X), or

(2)  $\Gamma = 0$ and $\gamma = \Gamma a = 0$ so that the original
     specification Z = XB + E was correct, or

(3)  $\gamma - \Gamma_1 D_{1y} = 0$, which will occur whenever
     $Y_{.x} = Z_{1.x}c + \delta$, where c is a vector independent of
     $Z_{1.x}$ and $\delta$ a random variable independent of $Z_{1.x}$
     and with expectation 0. A particular case of this is when the scores
     on all items are known, in which case $Y_{.x} = Z_{1.x}a$.


Observe that it is not sufficient to assume that
$\Gamma = [\gamma\ \gamma\ \cdots\ \gamma]$, so that the relationship
between the conditioning variables U and the vector of scores on each item
is the same. For although then $\Gamma_1 = \gamma 1'$, the value of the
bias is

$$ U_{.x}(\gamma\gamma'\Psi f - I)\,\gamma g $$

where

$$ f = 1'(\Sigma_{11} + \gamma'\Psi\gamma\, 1 1')^{-1} 1 \quad\text{and}\quad
   g = 1 - 1'\Sigma_{11}^{-1}\sigma_{1y} $$

are both scalars. Under the assumption that $\Gamma_1 = \gamma 1'$, the
bias is non-zero unless g = 0.

Since the predicted value $\hat{y}_i$ is biased, so is the plausible value
$\tilde{y}_i = \hat{y}_i + e_i$, where $e_i$ is a random normal deviate
with expectation 0 and variance $\sigma^2_{RES}$, where $\sigma^2_{RES}$ is
the residual mean squared error.

Although $\hat{y}_i$ and $\tilde{y}_i$ are generally biased for each i,
certain estimates which are linear combinations of the set of plausible
values, one for each person, are not.

Let $\mathrm{BIAS}_i$ be the row of the BIAS matrix corresponding to
respondent i. We may write

$$ \mathrm{BIAS}_i = u_{i.x}\lambda_i $$

where $u_{i.x}$ is the i-th row of $U_{.x}$ and $\lambda_i$ is the
remainder of the BIAS matrix, which depends on the particular set of
exercises answered by respondent i.


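Then the N x 1 vector $\hat{Y}$ of predicted scores for all N respondents
has expectation

$$ E(\hat{Y}) = X(X'X)^{-1}X'E(Y) + U_{.x}\gamma + U_{.x}\Lambda $$

where $\Lambda$ is the block diagonal matrix

$$ \Lambda = \begin{bmatrix}
   \lambda_1 & 0 & \cdots & 0 \\
   0 & \lambda_2 & \cdots & 0 \\
   \vdots & & \ddots & \vdots \\
   0 & 0 & \cdots & \lambda_N
   \end{bmatrix} . $$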

Let L'Y be any linear combination of the unobserved scores Y which is an
unbiased estimate of some population parameter (or vector or matrix of
parameters) T. The corresponding estimate of T based on the predicted (or
plausible) values has expectation

$$ \begin{aligned}
   L'[X(X'X)^{-1}X'(X\beta + U\gamma) + U_{.x}\gamma + U_{.x}\Lambda]
   &= L'(X\beta + U\gamma) + L'U_{.x}\Lambda \\
   &= T + L'U_{.x}\Lambda
   \end{aligned} $$

where $L'U_{.x}\Lambda$ is the bias of the estimate.

Hence, if L is a linear combination of the columns of X, so that L = XW for
some W, then since $L'U_{.x} = W'X'U_{.x} = 0$, the estimator is unbiased.

In particular, the plausible values allow the unbiased estimation of any
linear function of the parameters
$\beta^* = \beta + (X'X)^{-1}X'U\gamma$.

When L is not in the column space of X, the estimator $L'\tilde{y}$
provides a biased estimate of T where the amount of bias depends on the
relative amount of information in $U_{.x}$ not being accounted for by the
observed responses.
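
The orthogonality argument can be checked numerically in a small case. The
sketch below assumes, for illustration only, that X consists of 0-1 dummy
columns for two groups, so that adjusting U for X amounts to subtracting
group means; the data values and function names are hypothetical.

```python
def residualize_on_groups(values, groups):
    """Subtract group means: (I - K_x)U when X is a set of group dummies."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


groups = ["a", "a", "a", "b", "b"]
u = [1.0, 4.0, 7.0, 2.0, 6.0]  # a neglected variable U (hypothetical values)
u_dot_x = residualize_on_groups(u, groups)

# L gives the mean of group "a": a combination of the dummy columns of X.
l_in_x = [1 / 3 if g == "a" else 0.0 for g in groups]
# L contrasts two respondents inside group "a": not in the column space of X.
l_not_in_x = [1.0, -1.0, 0.0, 0.0, 0.0]

inside = dot(l_in_x, u_dot_x)       # zero up to rounding: no bias term
outside = dot(l_not_in_x, u_dot_x)  # non-zero here: a bias term can remain
```

A subgroup mean defined by a conditioning variable is therefore protected
from this source of bias, while a contrast within a conditioning cell is
not.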




Measures  of  the  potential  amount  of  this  bias  for  the  NAEP  writing 
plausible  values  are  presented  in  Section  11.4.6.    Prior  to  that,  we  turn 
to  the  application  of  the  average  response  method  of  estimation  to 
construct  those  values. 

11.4.4    Application of the Average Response Method to NAEP Writing Data

The  writing  scale  for  the  Year  15  assessment  was  based  on  a  set  of  ten 
writing  exercises  and  was  constructed  by  an  application  of  the  average 
response  method  (ARM)  to  the  observed  responses  to  these  items.  ARM 
writing  scale  plausible  values  were  computed  for  each  student  who  was  in 
one  of  the  three  modal  grades  (grades  4,  8  and  11)  and  who  additionally 
responded  to  at  least  one  of  these  ten  writing  items.    This  section  details 
the  construction  of  these  plausible  values. 

The ten writing exercises which were selected to form the writing scale
are listed in Table 11.4(1). These particular exercises were chosen
because all inter-item correlations are estimable and because each of the
items was administered to at least two of the grades (with one exception
noted below). The selected exercises constitute the complete set of
writing exercises which satisfy both of these criteria.

The letters "A" through "G" in Table 11.4(1) give the grouping of items
into blocks for the purposes of administration of the items to students.
These blocks are a part of the full set of nineteen blocks of items which
were administered by BIB spiralling in the Year 15 assessments of reading
and writing. Details of the BIB spiralling appear in Chapter 5. The
pertinent characteristics for present purposes are that every block of
items and every pair of blocks of items are administered to randomly
equivalent subsamples of students. Approximately 2,000 students at a given
grade responded to any one block of items; approximately 200 of a given
grade responded to a pair of items in different blocks.

As indicated in Table 11.4(1), the entire set of ten exercises was
assessed in Grade 8 while eight of the exercises were presented to students
in Grade 4 and six to students in Grade 11. Nine of the exercises were
presented to at least two grades with information on five of the exercises
obtained from all three grades. (The remaining item, N000502, was
presented to only Grade 8 but was included because it could be linked to
item N000602 which was also given in Grade 4.) The fact that not all items
were presented at each grade has consequences in the estimation of the
cross-product matrix C and in the estimation of plausible values. None of
the students took more than four of the writing exercises and the majority
took only one or two of the ten exercises. The exact distribution of the
number of items taken, by grade, is shown in Table 11.4(2).




                          Table 11.4(1)
          NAEP Writing Items for the ARM Writing Scale

                                             Block
Item                            Grade 4     Grade 8     Grade 11

N000102   Dali                     A           A           A
N000202   School Rule              B           B           B
N000302   Recreation Opp.                      C           C
N000402   Food on Frontier         D           D
N000502   Dissecting Frogs                     E
N000602   XYZ Company              E           E
N000702   Swimming Pool            F           F           F
N000802   Pets                     F           F
N000902   Radio Station            G           G
N001002   Appleby House            G           G           G

                          Table 11.4(2)
      Distribution by Grade of the Number of Writing Scale
                    Items Taken by a Student

                            Number of Students
Items Taken       Grade 4       Grade 8       Grade 11

     1             4,570         4,261          7,979
     2             2,883         3,741          2,195
     3             1,022         1,966            483
     4               332         1,124              0

   Total           8,807        11,092         10,657




As was noted in Section 11.4.2, the basis for estimation of a predicted
value for any given student is the full cross-products matrix

$$ C = \begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z \end{bmatrix} $$

from which all other necessary matrices and estimates are derived. For the
construction of the NAEP writing scale, this matrix C was formed by
creating an analogous matrix for each grade and then pooling the resulting
matrices together.

In the matrix C, and in the grade analogues $C_4$, $C_8$ and $C_{11}$, the
conditioning matrix X is a 0-1 design matrix specifically controlling for
the main effects of the following conditioning variables:

     Grade                         Grade 4
                                   Grade 8
                                   Grade 11

     Sex                           Male
                                   Female

     Race/Ethnicity                White
                                   Black
                                   Hispanic
                                   Other

     Size and Type                 Advantaged urban
     of Community                  Disadvantaged urban
                                   Other

     Region                        NE (Northeast)
                                   SE (Southeast)
                                   C (Central)
                                   W (West)

     Parental                      Less than High School Grad
     Education                     Graduated High School
                                   Post High School
                                   Unknown


The values of the conditioning variables are known for all students and
so X'X in each of the cross-products matrices is directly obtained by
taking the sum of squares and cross-products of the conditioning variables
for each student, weighting these by the student's sampling weight and then
summing across all students of the given grade.




For example, let $X_8$ be the conditioning matrix for the sample of
students in Grade 8. This sample was drawn using a complex sample design
with unequal probabilities of selection of the various respondents in the
sample. To account for these differential probabilities of selection, each
student is assigned a sampling weight which is the reciprocal of the
probability that the student was selected (and which also contains
adjustments for nonresponse and post-stratification). An (approximately)
unbiased and consistent estimate of the cross-product matrix X'X for the
population of students in Grade 8 is the traditional weighted cross-product
matrix

$$ (X'X)_8 = \sum_i w_{8i}\, x_{8i}'x_{8i} $$

where $x_{8i}$ is the row vector giving the values of the conditioning
variables for the i-th student and $w_{8i}$ is the student's weight.
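
As an illustration of the weighted cross-product above, the following
sketch accumulates the weighted sum of outer products of each student's
row of conditioning variables. The three-column design and the weights are
hypothetical; the actual design matrix has one column per level listed
earlier.

```python
def weighted_cross_product(rows, weights):
    """Return sum_i w_i * x_i' x_i as a k x k list of lists."""
    k = len(rows[0])
    xtx = [[0.0] * k for _ in range(k)]
    for x, w in zip(rows, weights):
        for a in range(k):
            for b in range(k):
                xtx[a][b] += w * x[a] * x[b]
    return xtx


# Hypothetical 0-1 design rows: [grade intercept, female, Black]
rows = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
weights = [10.0, 20.0, 5.0]  # hypothetical sampling weights
xtx = weighted_cross_product(rows, weights)
```

With 0-1 coding, the diagonal entries are simply the weighted counts of
students at each level and the off-diagonal entries are weighted counts of
students who share two levels.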


Since no student responded to all writing items, the remaining terms of
the complete cross-product matrices $C_4$, $C_8$ and $C_{11}$, the terms
X'Z and Z'Z, cannot be directly estimated in this manner. However, the
characteristics of the BIB spiralling assignment of exercises to students
allow the consistent estimation of the components of these terms related
to the pool of exercises assigned to a given grade. The procedure used to
accomplish this estimation is discussed next. Since all ten items were
presented to Grade 8, this produces the final estimate of the cross-product
matrix $C_8$. Because not all exercises were presented to Grade 4 and Grade
11 students, additional work, discussed subsequently, is needed to estimate
the matrices $C_4$ and $C_{11}$.

The cross-product matrix $C_8$, for Grade 8 students, is

$$ C_8 = \begin{bmatrix} (X'X)_8 & (X'Z)_8 \\ (Z'X)_8 & (Z'Z)_8 \end{bmatrix} . $$

The submatrix $(Z'Z)_8$ is to be a consistent estimate of the 10 x 10 item
score cross-product matrix that would have been obtained had all students
in the Grade 8 population responded to each of the ten writing items.
Typical elements of this matrix are the estimated population sum-of-squared
scores for a given item (say the first), $(Z_1'Z_1)_8$, and the estimated
population sum-of-products of scores for a given pair of items (say the
first and second), $(Z_1'Z_2)_8$.

Because of the BIB spiral design, we can assume that the set of
(approximately 2,000) students in Grade 8 who responded to a given item is
a representative sample of the population (of all students in Grade 8 who
would have responded to the item had it been presented to them).
Consequently, the appropriately weighted sample mean, $\bar{Z}_1$, and the
weighted sample variance, $S_1^2$, based on the total sample of students in
Grade 8 responding to the first item, are consistent and unbiased estimates
of the population mean and variance for that item. A consistent estimator
of the sum-of-squared scores in the population of a given item (e.g., the
first) is

$$ (Z_1'Z_1)_8 = W_{TOT}\,(S_1^2 + \bar{Z}_1^2) $$

where $W_{TOT}$ is the sum of weights of all Grade 8 students.
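
The sum-of-squares estimate can be sketched as below. The scores, weights
and total weight are hypothetical, and the weighted variance uses the sum
of weights as its divisor, which is an assumption made here for simplicity
rather than the report's stated choice.

```python
def weighted_mean_var(scores, weights):
    """Weighted mean and weighted variance (divisor: sum of weights)."""
    w_sum = sum(weights)
    mean = sum(w * z for z, w in zip(scores, weights)) / w_sum
    var = sum(w * (z - mean) ** 2 for z, w in zip(scores, weights)) / w_sum
    return mean, var


def estimated_sum_of_squares(scores, weights, w_tot):
    """W_TOT * (S^2 + Zbar^2), using only the students who took the item."""
    mean, var = weighted_mean_var(scores, weights)
    return w_tot * (var + mean ** 2)


scores = [1.0, 2.0, 3.0, 4.0]   # item scores of the responding subsample
weights = [1.0, 1.0, 2.0, 2.0]  # their sampling weights
w_tot = 60.0                    # sum of weights of ALL students in the grade
ss_estimate = estimated_sum_of_squares(scores, weights, w_tot)
```

With this divisor the estimate equals the respondents' weighted mean square
scaled up from their own total weight (6) to the grade's total weight (60).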

The consistent estimation of sum-of-products of scores in the
population is enabled by the observation that, due to the BIB spiralling,
the sample of (about 200) students in Grade 8 who responded to a given pair
of items (in different blocks) is also a representative (albeit smaller)
sample of the population. Consequently, the appropriately weighted sample
correlation, $r_{12}$, based on the students in the grade who responded to
both (the first and second) items is a consistent estimator of the
population correlation between these items. A consistent estimate of the
sum-of-products of scores on these items in the population is:

$$ (Z_1'Z_2)_8 = W_{TOT}\,(S_1 S_2 r_{12} + \bar{Z}_1\bar{Z}_2) $$

where $S_1$ and $\bar{Z}_1$ are unbiased and based on the full set of
Grade 8 students responding to item 1 and $S_2$ and $\bar{Z}_2$ are
unbiased and based on the full set of Grade 8 students responding to
item 2.
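
A sketch of the sum-of-products estimate follows. It takes the means and
standard deviations from everyone who answered each item and the
correlation only from the smaller group who answered both. All data are
hypothetical, and the sum-of-weights divisor is again an assumption.

```python
from math import sqrt


def weighted_mean_sd(scores, weights):
    """Weighted mean and standard deviation (divisor: sum of weights)."""
    w_sum = sum(weights)
    mean = sum(w * z for z, w in zip(scores, weights)) / w_sum
    var = sum(w * (z - mean) ** 2 for z, w in zip(scores, weights)) / w_sum
    return mean, sqrt(var)


def weighted_correlation(z1, z2, weights):
    """Weighted correlation among students who answered both items."""
    m1, s1 = weighted_mean_sd(z1, weights)
    m2, s2 = weighted_mean_sd(z2, weights)
    w_sum = sum(weights)
    cov = sum(w * (a - m1) * (b - m2) for a, b, w in zip(z1, z2, weights)) / w_sum
    return cov / (s1 * s2)


def estimated_sum_of_products(item1, item2, pair, w_tot):
    """W_TOT * (S1 * S2 * r12 + Zbar1 * Zbar2)."""
    mean1, sd1 = weighted_mean_sd(*item1)
    mean2, sd2 = weighted_mean_sd(*item2)
    r12 = weighted_correlation(*pair)
    return w_tot * (sd1 * sd2 * r12 + mean1 * mean2)


item1 = ([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0])  # (scores, weights)
item2 = ([2.0, 4.0, 6.0, 8.0], [1.0, 1.0, 1.0, 1.0])
pair = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0])  # joint respondents
sp_estimate = estimated_sum_of_products(item1, item2, pair, w_tot=100.0)
```

Because the correlation rests on roughly a tenth as many students as the
means and variances, it is the least precisely estimated ingredient.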

The estimation of the terms in the matrix $(X'Z)_8$ was accomplished in an
analogous manner.

The resultant cross-product matrix $C_8$ is a consistent and approximately
unbiased estimator of the cross-product matrix for the population of Grade
8 students, but it is not the maximum likelihood estimator. Maximum
likelihood essentially requires estimation of the responses of each
individual to the items not presented to that individual, this estimation
being based on the available information from that individual and the
interrelationships between items observed in the entire sample. However,
because the missing information (the items not presented to the individual)
can be quite reasonably assumed to be missing due to a random process
unrelated to the measurements of interest, the practical difference between
the estimator $C_8$ and the maximum likelihood estimator is likely to be
small and overwhelmed by the sampling variability. Actual comparison of the
maximum likelihood estimator and the estimator $C_8$ bears this out.




Because not all items were presented at Grades 4 and 11, the
cross-product matrices, $C_4$ and $C_{11}$, for those grades had missing
cells, corresponding to the items which were not presented. For Grade 4,
there were two missing items (N000302 and N000502). The cells of the matrix
corresponding to these missing items (which include all sums of
cross-products involving either missing item) were filled in by

(1)  assuming that, for the population of Grade 4 students, the
     conditional distribution of the two items given the background
     characteristics and responses to the 8 items actually assigned
     to Grade 4 is the same as the equivalent conditional
     distribution for the population of Grade 8 students, and is
     multivariate normal,

(2)  estimating this conditional distribution from the Grade 8
     sample, and

(3)  combining this estimate with the known information obtained
     from the Grade 4 sample.


Specifically, by appropriate permutation of its rows and columns, the
Grade 4 cross-products matrix $C_4$ can be written as

$$ C_4 = \begin{bmatrix}
   X_4'X_4 & X_4'Z_{1(4)} & X_4'Z_{2(4)} \\
   Z_{1(4)}'X_4 & Z_{1(4)}'Z_{1(4)} & Z_{1(4)}'Z_{2(4)} \\
   Z_{2(4)}'X_4 & Z_{2(4)}'Z_{1(4)} & Z_{2(4)}'Z_{2(4)}
   \end{bmatrix} $$

where $X_4$ is the conditioning matrix for Grade 4 (with the dummy
variables for Grade 8 and Grade 11 removed), $Z_{1(4)}$ corresponds to the
set of eight items presented to Grade 4 students and $Z_{2(4)}$ to the
remaining two, unpresented, items. Writing $V_4$ as the matrix
$[X_4 \;\; Z_{1(4)}]$ of known information for the grade, the matrix $C_4$
can be rewritten as

$$ C_4 = \begin{bmatrix}
   V_4'V_4 & V_4'Z_{2(4)} \\
   Z_{2(4)}'V_4 & Z_{2(4)}'Z_{2(4)}
   \end{bmatrix} . $$


For notational convenience, we will operate as if the entire population
of Grade 4 students had been measured and that complete information by
student is available for all columns of the matrix $V_4$. There is no loss
of generality, provided that consistent estimates of the terms of the
cross-product matrix are available. The cross-product $V_4'V_4$ is
estimated from the weighted pairwise information in the same manner as was
used to estimate the terms of the Grade 8 cross-product matrix $C_8$.
However, since the items in $Z_{2(4)}$ were not presented to Grade 4, no
direct estimates of $V_4'Z_{2(4)}$ and $Z_{2(4)}'Z_{2(4)}$ are available
which use only Grade 4 data. These terms are accordingly estimated on the
basis of relationships present in the Grade 8 data.

Conformably permute and partition the Grade 8 cross-products matrix $C_8$
so that it may be written

$$ C_8 = \begin{bmatrix}
   V_8'V_8 & V_8'Z_{2(8)} \\
   Z_{2(8)}'V_8 & Z_{2(8)}'Z_{2(8)}
   \end{bmatrix} $$

where $V_8 = [X_8 \;\; Z_{1(8)}]$, $X_8$ being the conditioning matrix
(with the Grade 4 and Grade 11 dummy variables removed and the Grade 8
dummy variable in the same position as the Grade 4 variable for the matrix
$X_4$).

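Assume that

$$ [Z_{1(4)} \;\; Z_{2(4)}] = X_4\,[B_{1(4)} \;\; B_{2(4)}]
   + [E_{1(4)} \;\; E_{2(4)}] $$

where $B_{1(4)}$ is an m x 8 matrix of unknown constants, $B_{2(4)}$ is an
m x 2 matrix of unknown constants, and $[E_{1(4)} \;\; E_{2(4)}]$ is a
matrix of unknown errors, each row of which is distributed
$N(0, \Sigma)$, independent of the other rows.

Additionally, assume that

$$ [Z_{1(8)} \;\; Z_{2(8)}] = X_8\,[B_{1(8)} \;\; B_{2(8)}]
   + [E_{1(8)} \;\; E_{2(8)}] $$

where the rows of $[E_{1(8)} \;\; E_{2(8)}]$ are also independently
$N(0, \Sigma)$ distributed.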

Under these assumptions and supposing that the elements of the Grade 4
matrix $V_4$ were completely known, it is possible to construct an
estimator of $Z_{2(4)}$ based on the available Grade 4 data and the linear
relationship between $Z_{2(8)}$ and $V_8$ estimated from the Grade 8 data.
This estimator of $Z_{2(4)}$ is

$$ \hat{Z}_{2(4)} = V_4\,(V_8'V_8)^{-1}V_8'Z_{2(8)} . $$

The corresponding estimator of the cross-product matrix $V_4'Z_{2(4)}$,
which only requires knowledge of the matrix $V_4'V_4$, is:

$$ V_4'\hat{Z}_{2(4)} = (V_4'V_4)\,(V_8'V_8)^{-1}V_8'Z_{2(8)} . $$


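The expected value of $V_4'\hat{Z}_{2(4)}$ is:

$$ E(V_4'\hat{Z}_{2(4)}) =
   \begin{bmatrix} E(X_4'\hat{Z}_{2(4)}) \\ E(Z_{1(4)}'\hat{Z}_{2(4)}) \end{bmatrix}
   = \begin{bmatrix} X_4'X_4 B^*_{2(4)} \\
     B_{1(4)}'X_4'X_4 B^*_{2(4)} + N\Sigma_{12} \end{bmatrix} $$

where

$$ B^*_{2(4)} = B_{2(8)} + (B_{1(4)} - B_{1(8)})\,\Sigma_{11}^{-1}\Sigma_{12}, $$

$\Sigma_{11} = \mathrm{Var}(E_{1(4)})$ and
$\Sigma_{12} = \mathrm{Cov}(E_{1(4)}, E_{2(4)})$.

This estimator is biased unless $B^*_{2(4)} = B_{2(4)}$.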
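
The form of $B^*_{2(4)}$ follows from substituting the two assumed models
into the regression estimator. The sketch below replaces the Grade 8 sample
regression coefficients by their population values, a large-sample
simplification adopted here for illustration and not a step stated in the
text.

```latex
\begin{aligned}
Z_{2(8)} &= X_8\left(B_{2(8)} - B_{1(8)}\Sigma_{11}^{-1}\Sigma_{12}\right)
            + Z_{1(8)}\Sigma_{11}^{-1}\Sigma_{12} + \text{residual}, \\
\hat{Z}_{2(4)} &\approx X_4\left(B_{2(8)} - B_{1(8)}\Sigma_{11}^{-1}\Sigma_{12}\right)
            + \left(X_4 B_{1(4)} + E_{1(4)}\right)\Sigma_{11}^{-1}\Sigma_{12} \\
         &= X_4 B^*_{2(4)} + E_{1(4)}\Sigma_{11}^{-1}\Sigma_{12} .
\end{aligned}
```

In particular, when the regressions on the conditioning variables are the
same in the two grades, so that $B_{1(4)} = B_{1(8)}$ and
$B_{2(4)} = B_{2(8)}$, the condition $B^*_{2(4)} = B_{2(4)}$ holds.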


The obvious estimator of $Z_{2(4)}'Z_{2(4)}$ is
$\hat{Z}_{2(4)}'\hat{Z}_{2(4)}$, which has expected value

$$ E(\hat{Z}_{2(4)}'\hat{Z}_{2(4)}) = B^{*\prime}_{2(4)}X_4'X_4B^*_{2(4)}
   + N\,\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} + E(G)\,\Sigma_{2.1} $$

where
$\Sigma_{2.1} = \mathrm{Var}(E_{2(4)}) - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$
is the conditional variance of $E_{2(4)}$ given $E_{1(4)}$, and where

$$ G = \mathrm{trace}\,[(V_8'V_8)^{-1}V_4'V_4] . $$

Even if $B^*_{2(4)} = B_{2(4)}$, $\hat{Z}_{2(4)}'\hat{Z}_{2(4)}$ provides a
biased estimate of $Z_{2(4)}'Z_{2(4)}$, the bias being due to reduction in
variability due to prediction by regression. The value of the bias is
$(N - E(G))\,\Sigma_{2.1}$.


Let

$$ \hat{\Sigma}_{2.1} = \frac{1}{N - m - p_1}
   \left( Z_{2(8)}'Z_{2(8)} - Z_{2(8)}'V_8(V_8'V_8)^{-1}V_8'Z_{2(8)} \right), $$

where $m + p_1$ is the number of columns of $V_8$ and $\hat{\Sigma}_{2.1}$
is the residual mean square for prediction of $Z_{2(8)}$ from $X_8$ and
$Z_{1(8)}$. $\hat{\Sigma}_{2.1}$ is an unbiased estimator of
$\Sigma_{2.1}$.

Further, $(N - G)\,\hat{\Sigma}_{2.1}$ is an unbiased estimator of
$(N - E(G))\,\Sigma_{2.1}$. The appropriate estimator of the cross-product
matrix $Z_{2(4)}'Z_{2(4)}$ is

$$ \widehat{Z_{2(4)}'Z_{2(4)}} = \hat{Z}_{2(4)}'\hat{Z}_{2(4)}
   + (N - G)\,\hat{\Sigma}_{2.1} $$

which has expected value

$$ E\left(\widehat{Z_{2(4)}'Z_{2(4)}}\right) = B^{*\prime}_{2(4)}X_4'X_4B^*_{2(4)}
   + N\,\mathrm{Var}(E_{2(4)}) $$

and thus is unbiased whenever $B^*_{2(4)} = B_{2(4)}$.
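
The fill-in steps can be traced in the simplest possible case: one
predictor column and one missing item, so that every matrix is a scalar and
the trace is an ordinary ratio. The function and argument names below are
hypothetical. Using separate sizes n4 and n8 for the two grades is an
assumption of this sketch; the text writes N throughout.

```python
def impute_missing_item_cross_products(v8v8, v8z8, z8z8, n8, v4v4, n4, n_cols=1):
    """Scalar sketch of the Grade 4 fill-in from Grade 8 cross-products."""
    coef = v8z8 / v8v8          # (V8'V8)^-1 V8'Z2(8)
    v4z_hat = v4v4 * coef       # estimate of V4'Z2(4)
    zhat_zhat = coef * v4v4 * coef  # naive Zhat'Zhat, too small on average
    sigma_21 = (z8z8 - v8z8 * coef) / (n8 - n_cols)  # residual mean square
    g = v4v4 / v8v8             # trace[(V8'V8)^-1 V4'V4] in the scalar case
    z_z_corrected = zhat_zhat + (n4 - g) * sigma_21  # restores lost variability
    return v4z_hat, zhat_zhat, z_z_corrected


# Hypothetical cross-products for illustration only.
v4z_hat, zhat_zhat, z_z_corrected = impute_missing_item_cross_products(
    v8v8=100.0, v8z8=50.0, z8z8=60.0, n8=21, v4v4=80.0, n4=11
)
```

The gap between the naive and corrected values shows how much variability
regression prediction alone would remove from the imputed item.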

Estimation of the missing cells in the Grade 11 cross-product matrix
$C_{11}$ was accomplished in an analogous manner, again using the
relationships from the Grade 8 data.

Finally, the overall cross-product matrix C was formed by pooling the
grade level cross-product matrices $C_4$, $C_8$ and $C_{11}$. In this
pooling, the main effects of grade (i.e., the intercept term in each of the
grade level cross-product matrices) were kept separate. All other
conditioning effects (the main effects of race, region, size and type of
community and parental education) were pooled across grades as were the
item-by-item cross-product matrices.




The resultant matrix C was then used as the basis for constructing the
cross-product matrix detailed in Section 11.4.2. The estimation of
plausible values for all students in all three grades was accomplished
according to the formulas in Sections 11.4.2 and 11.4.3 using that matrix
as the basis. To approximately account for the effects of the sample design
and the amount of information available, the matrix was scaled to be
consistent with a sample size of 200. Five plausible values were computed
for each student. Note that the additional source of uncertainty due to
the estimation of parameters, described in Section 11.4.3, was not
included. This, in effect, assumes that the sample size is large enough so
that the variance contribution due to the regression parameters can be
neglected.
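
Drawing the five plausible values for one student can be sketched as
follows, under the simplification noted above that the regression
parameters are treated as known. The predicted value, residual variance and
seed are hypothetical inputs, and the function name is illustrative.

```python
import random


def draw_plausible_values(predicted, residual_variance, n_draws=5, rng=None):
    """Return n_draws values: predicted score plus a normal residual draw."""
    if rng is None:
        rng = random.Random()
    sd = residual_variance ** 0.5
    return [predicted + rng.gauss(0.0, sd) for _ in range(n_draws)]


# Hypothetical student: predicted mean writing score 2.4, residual variance 0.36.
plausible = draw_plausible_values(2.4, 0.36, rng=random.Random(1984))
```

Each of the five values would go into a separate analysis data set, and
results would then be combined across the sets.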


11.4.5    An Alternate, Unbiased, Estimator for Linear Combinations
of Mean Writing Scores

Section 11.4.4 showed that the ARM scale values produce estimates of
composites which are generally biased with the amount of bias related to
the amount of information in the neglected conditioning variables U which
is not linearly contained in the employed conditioning variables X.

For a wide class of statistics an alternate estimation procedure is
available which produces estimates which are unbiased, regardless of
whether or not all appropriate conditioning variables have been identified.

This alternate procedure is based on the facts that

(1)  the target quantity of interest, Y, is a linear combination of
     the component quantities $Z_1, Z_2, \ldots, Z_p$ (so that Y = Za), and

(2)  information on the values of each of the item score variables,
     the $Z_i$, is available on a representative subsample of the
     population.


Suppose that the value of the mean score across the p items were known
for every individual in the sample, so that the vector Y were completely
known, and consider the statistic

$$ t = L'Y, $$

for some vector or matrix L. Thus t is a linear combination of the
elements in Y. Examples of this are subgroup means, contrasts of subgroup
means, and more generally, regression coefficients.

Suppose that t is an unbiased estimator of the population value T.
Then, since

$$ T = E(t) = E(L'Y) = E(L'Z)a, $$

the quantity of interest T can be expressed as a linear combination of
the quantities $T_i$, where $T_i$ is the equivalent population value for
the scores $Z_i$ on item i, and where $T_i$ is estimated unbiasedly by the
statistic

$$ t_i = L'Z_i . $$

(For the moment we assume that the score on item i is known for all
students in the sample.)

As an example, if T is a vector of subgroup means of the average
performance across the p items, then $T_i$ is the vector of subgroup mean
performance on the specific item i and so T is quite evidently the average
of these item level mean performance vectors.

Now, although the score on item i is only known for a subsample of
students, this subsample is a representative sample of the population.
This means that an unbiased and consistent estimator of the item level
parameter vector $T_i$ based only on the available information from the
subsample of students responding to the item is

$$ t_i^* = L_i'Z_i^* $$

where $Z_i^*$ is the vector of known scores and $L_i'$ is the matrix of
associated values, chosen so that

$$ E(t_i^*) = T_i . $$

In the example where $T_i$ is the r x 1 vector of r subgroup mean
performance levels on item i, the corresponding estimator $t_i^*$ is the
r x 1 vector of the weighted mean scores, by subgroup, across all members
of the subgroup responding to the item.

Then, since $t_i^*$ is an unbiased estimator of $T_i$, for each item i, it
follows automatically that

$$ t^* = \sum_{i=1}^{p} a_i t_i^* $$

is an unbiased estimator of

$$ T = \sum_{i=1}^{p} a_i T_i . $$
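
For subgroup means, the weighted combination of item-level estimates can
be sketched as below. Each item's subgroup means are computed only from the
students who answered that item and are then combined with the weights a.
The two-item data, the equal weights and the function names are
hypothetical.

```python
def weighted_subgroup_means(scores, weights, subgroups, labels):
    """Weighted mean score per subgroup among respondents to one item."""
    means = []
    for label in labels:
        num = sum(w * z for z, w, g in zip(scores, weights, subgroups) if g == label)
        den = sum(w for w, g in zip(weights, subgroups) if g == label)
        means.append(num / den)
    return means


def combine_item_estimates(item_estimates, a):
    """t* = sum_i a_i * t_i*, computed element by element."""
    r = len(item_estimates[0])
    return [sum(a_i * t_i[j] for a_i, t_i in zip(a, item_estimates)) for j in range(r)]


labels = ["M", "F"]
# Different students answer each item under the spiralled design.
t1 = weighted_subgroup_means([2.0, 4.0, 1.0, 3.0], [1.0, 1.0, 1.0, 1.0],
                             ["M", "M", "F", "F"], labels)
t2 = weighted_subgroup_means([4.0, 2.0, 3.0], [1.0, 1.0, 2.0],
                             ["M", "F", "F"], labels)
t_star = combine_item_estimates([t1, t2], a=[0.5, 0.5])
```

No individual-level score is ever formed; only item-level aggregates are
combined.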


It  is  also  possible  to  obtain  an  estimate  of  the  simplinp:  variance  of 
t*  by  Jackknifing  the  matrix  [t*  y  .yt^^]  at  the  PSU  level  because,  due  to 
BIB  spiralling,  equivalent  samples  of  tl^e  population  of  students  within 
each  PSU  respond  to  each  item.  Let 


T    =  {t*  I*  J 

k        ^      Ik*        *  pk^ 

be  the  matrix  with  columns  corresponding  to  the  pseudo-replicates  of  the  t* 

corresponding  to  the         PSU  pair.    Then  the  pseudo-replicate  of 

t*  m  (t*,...,t*  ]a  corresponding  to  the  k^^  PSU  pair  is 
1  p 

and  the  jackknife  variance  estimate  of  t^,  which  accounts  for  inter-item 
covariances  is 

fi 

Var(t*)  =  Z  (t*  -  t*)  (t*  -  t*)' 


wl  ich  is  a  variance-covariance  matrix  of  order  r  where  r  is  the  number  of 
elements  in  the  vector  t*.    (For  a  further  discussion  of  jackknife  variance 
estimation  see  Chapter  13.) 
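
The jackknife computation described above can be sketched in a few lines. This is a minimal illustration, not the production NAEP code: the function name and the layout of the inputs (rows are the r subgroup entries, columns are the p items, one replicate matrix per PSU pair) are assumptions made for the example.

```python
def meanparts_jackknife(t_items, t_items_reps, a):
    """Combine item-level estimates and jackknife the result.

    t_items      : r x p nested list, full-sample item-level estimates
                   (column i holds the vector for item i)
    t_items_reps : list of r x p nested lists, one pseudo-replicate
                   matrix per PSU pair (hypothetical input layout)
    a            : list of p item weights
    Returns (t_star, var) where var is the r x r matrix
    sum over PSU pairs of (t_k - t)(t_k - t)'.
    """
    r = len(t_items)

    def combine(matrix):
        # Linear combination of the item columns: matrix times a
        return [sum(a_i * x for a_i, x in zip(a, row)) for row in matrix]

    t_star = combine(t_items)
    var = [[0.0] * r for _ in range(r)]
    for rep in t_items_reps:
        # Deviation of this pseudo-replicate from the full-sample estimate
        dev = [tk - t for tk, t in zip(combine(rep), t_star)]
        for u in range(r):
            for v in range(r):
                var[u][v] += dev[u] * dev[v]
    return t_star, var
```

Because each pseudo-replicate is formed by applying the same weights a to a whole replicate matrix, the covariances between item-level estimates are carried into the result without being estimated separately.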

Because the estimator $t^*$ of group level data is computed as a linear
combination of unbiased estimators of the corresponding parameters for each
of the constituent parts of y = Za, and because this linear combination is
often a mean, the estimator $t^*$ will be referred to as the meanparts
estimator.

The meanparts estimator of some quantity of interest, say a group mean,
differs from the equivalent estimator based on the ARM scale values in a
fundamental way. The average response method seeks to obtain an unbiased
estimate of the mean writing score for every individual (and goes further
by also addressing the variability of that estimated score). If the method
is successful, meaning that all potential conditioning variables have been
included in the model (at least, to a practical approximation), then any
statistics based linearly on these ARM plausible values are automatically
unbiased. This means that the plausible values can be computed once and
for all and that any subsequent analyses can be performed on these sets of
plausible values, treating them as the actual values of y. (The analyses
still need to be repeated for at least two sets of plausible values to
correctly account for variability.) This is extremely convenient,
especially for exploratory analysis.




In contrast, the meanparts estimator never produces an estimate of an
individual value, but rather directly produces estimates of aggregate
quantities, where it is required that those aggregates can be expressed as
a linear combination of the equivalent aggregates of the constituent
items. The advantage of this is that such estimates are unbiased. The
disadvantage is that each separate analysis requires its own computation
of the pertinent meanparts estimator, this computation requiring p
separate computations: one for each of the items. This produces a
considerable increase in the computational load required for exploratory
analysis. Furthermore, the variance of the meanparts estimator [remainder
of sentence illegible in source].

A compromise might be to conduct initial analysis on the ARM
plausible values and use the meanparts analysis for the more critical
analyses or to verify the results suggested by the ARM-based analyses.

How well this might work is addressed in the next section.

Comparison of the ARM-Based and Meanparts Estimators
of Subgroup Mean Writing Scale Scores for the Year 15
Writing Assessment

This section compares estimates of subgroup writing performance, as
based on the ARM plausible values, with the corresponding meanparts
estimates, which are based on the subpopulation mean values on each
of the constituent items.

For a specified subgroup G of students in a given grade, the estimate
$t_G$ of average writing performance, based on the ARM plausible
values $\theta$, is the weighted mean of the plausible values for the
subgroup,

$$t_G = \frac{\sum_i w_i \theta_i}{\sum_i w_i}$$

where $w_i$ and $\theta_i$ are the sampling weight and plausible value (one
of a set) for the $i$th student of the given grade and specified subgroup and
where the summations extend over all students in the grade who are also
members of the subgroup. It has been noted that, unless the subgroup
corresponds to a linear combination of the conditioning variables X, the
statistic $t_G$ provides a biased estimate of subgroup performance.

The previous section showed that if all ten exercises were presented to the
grade, an unbiased estimate, the meanparts estimate, is available. Since
all ten items were only presented to Grade 8 students, this is the only
grade at which we have truly unbiased estimates of our defined measure of
writing performance. (Provisionally unbiased estimates of relative
performance of subgroups at the other grades will be discussed below.)

Restricting our attention to Grade 8 for the moment, let $t_{Gj}^*$ be the
weighted mean score on item j over all students in the subgroup responding
to the item, that is

$$t_{Gj}^* = \frac{\sum_i w_{ij} z_{ij}}{\sum_i w_{ij}}$$

where $z_{ij}$ is the score on item j for the $i$th student in the subgroup
responding to the item, $w_{ij}$ is that student's weight and where the
summations extend over all students in the grade and subgroup who
additionally responded to the item. Because the students in the subgroup
additionally responding to the item constitute a representative subsample
of all students in the subgroup, $t_{Gj}^*$ is an unbiased estimate of the
subgroup mean score on the item.

Then, since all ten items were presented to Grade 8, the unbiased
meanparts estimate of subgroup performance (across the ten items) is

$$t_G^* = \frac{1}{10} \sum_{j=1}^{10} t_{Gj}^*.$$

Analogous estimates of subgroup performance based on the ARM plausible
values can also be obtained in the same manner for students in Grades 4 and
11. However, because not all ten items were presented to those grades, we
cannot directly obtain the analogous meanparts estimates which pertain to
the mean of the ten items. Rather than making additional assumptions about
how the students would do on these unpresented items, we will instead
define the meanparts estimators of writing performance at Grades 4 and 11
to be the average of the scores of the items actually presented at the
grade. The meanparts estimator for Grade 4 is thus based on the mean of
eight item-level statistics and the meanparts estimator for Grade 11 on the
mean of six item-level statistics.
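
As an illustration of the definition just given, the following sketch computes a meanparts estimate as the unweighted mean of weighted item-level means, using only the items presented to the grade. The function names and the convention of marking a score as None when the student did not receive the item are assumptions made for this example.

```python
def weighted_item_mean(scores, weights):
    """Weighted mean score on one item over the students who responded.

    scores  : list of item scores, None where the student was not
              administered the item (hypothetical missing-data convention)
    weights : list of sampling weights, aligned with scores
    """
    num = 0.0
    den = 0.0
    for z, w in zip(scores, weights):
        if z is not None:
            num += w * z
            den += w
    return num / den


def meanparts_estimate(item_scores, weights):
    """Mean of the item-level weighted means for the items presented.

    item_scores : dict mapping item label -> list of scores per student
                  in the subgroup; only presented items are included
    """
    item_means = [weighted_item_mean(s, weights) for s in item_scores.values()]
    return sum(item_means) / len(item_means)
```

With ten entries in item_scores this corresponds to the Grade 8 estimate; with eight or six entries it corresponds to the Grade 4 or Grade 11 definition.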

The resulting estimates of the average grade level writing performance
are

               Grade 4    Grade 8    Grade 11

ARM              1.58       2.05       2.19

Meanparts        1.60       2.05       2.13




The standard error of each of these means is .01. As they should,
the ARM and meanparts estimates agree for Grade 8. The meanparts estimate
for Grade 4, essentially a mean of the eight items given to that grade, is
slightly larger than the ARM estimate which includes a prediction of the
scores on the remaining two items. Since those two items were deemed
inappropriate for Grade 4, being presumably more difficult, this difference
is in the expected direction. For Grade 11, the meanparts estimate on the
six items presented to the grade is lower than the ARM estimate which
includes predictions for the four items given at Grade 8 but deemed
inappropriate for Grade 11, being presumably too easy. The difference is
again in the expected direction.

Any bias of a subgroup performance estimate based on the ARM
plausible values is important only to the extent that it affects our
estimate of the relative standing of the subgroup in relation to other
subgroups or to the population as a whole. For simplicity, consider the
estimation of the difference in performance level between the subgroup and
the total population of students in the grade. We shall define the group
effect to be this difference.

The ARM-based estimate of a group effect is

$$D_G = t_G - t_T,$$

the difference between the subgroup mean plausible value and the mean
plausible value across all students in the grade.

The corresponding meanparts estimate is

$$D_G^* = t_G^* - t_T^*,$$

which is based only on the items presented to that grade.

A direct estimate of the bias of the ARM-based estimate is the
difference

$$D_G - D_G^*.$$

For Grades 4 and 11, this difference also contains a component due to
the estimation of missing items. This component can be assumed away by
making the presumption that the group effects for the missing items on
average equal the group effects for the items presented.
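
A small sketch of the quantities just defined may help fix the notation. The function names are hypothetical; the inputs are the subgroup and total-population means under each method.

```python
def group_effect(subgroup_mean, total_mean):
    """Group effect: subgroup mean minus the mean for all students in the grade."""
    return subgroup_mean - total_mean


def arm_bias_estimate(t_g_arm, t_t_arm, t_g_meanparts, t_t_meanparts):
    """Direct estimate of the bias of the ARM-based group effect.

    Computed as the ARM-based group effect minus the meanparts group
    effect. For Grades 4 and 11 this difference also reflects the
    estimation of items not presented to the grade.
    """
    return group_effect(t_g_arm, t_t_arm) - group_effect(t_g_meanparts, t_t_meanparts)
```

A negative value for a subgroup that performs above the population mean indicates that the ARM-based effect is closer to zero than the meanparts effect.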

To compare the performance of the ARM-based group effect estimates with
the equivalent meanparts group effect estimates we have computed both
estimates for each of 169 subgroups, where each subgroup is defined by the
response to one of 44 background and attitude questions in the common
core, which were asked of each student. The effects were computed
separately for each of the three grades. Each of the background questions
elicited information about the student's demographic characteristics (e.g.
sex, race, ethnicity, age), home environment (e.g. parental education, the
presence of 25 or more books in the home), or school experience (e.g.
number of book reports written).

Figures 11.4-1a, 11.4-1b and 11.4-1c are plots, for Grades 4, 8 and 11
respectively, of the ARM-based group effects versus the corresponding
meanparts group effects for the 21 subgroups which correspond to the
conditioning variables: sex, race/ethnicity, size and type of community,
region and level of parental education. Because these variables were
explicitly controlled for, the ARM and the meanparts estimates should be
closely comparable. The figures show that, in general, this is the case,
with the relationship between the two estimates being well-described by a
line which differs trivially from a line through the origin with a slope of
1. The relationship between the two estimates, while quite good, is,
however, not perfect. The major discrepancies occur for the Grade 4 and
Grade 11 data and, as noted on the figures, correspond to subgroups which
constitute a relatively small proportion of the population. These larger
discrepancies appear at Grades 4 and 11 but not at Grade 8 because of the
estimation of performance on the items not presented to
Grades 4 or 11. Recall that the estimates of the four items not presented
to Grade 11 were based on the relationships between those items (and the
remaining six) observed at Grade 8. In essence, the ARM-based group effect
for Grade 11 is a weighted average of the Grade 11 group effects for the
six items presented to Grade 11 and the Grade 8 group effects for the four
items not presented to Grade 11.

The variability of the plots of ARM versus meanparts group effects about
the  45  degree  line  is  due  to  the  pooling  of  the  cross-product  matrices  , 
C    and  C     prior  to  estimation  of  plausible  values.    This  pooling 
constrains the values of the group effects when averaged across grades, but
does not constrain the values of the within-grade group effects. This
corresponds to the assumption that the differential performance of a
subgroup relative to the population is the same regardless of grade.
Because of the generally low variability of the points in the plots about
the 45 degree line, this assumption appears quite reasonable.

Of great interest, of course, is how the ARM-based group effects
estimators perform for subgroups which were not specifically conditioned on.
This is indicated in Figures 11.4-2a, 11.4-2b and 11.4-2c which show the
plots, by grade, of the ARM-based group effects versus the meanparts group
effects for 136 of the remaining 148 subgroups which were not explicitly
conditioned on. (Each of the remaining twelve subgroups was based on
fewer than 100 respondents and was removed on the grounds that effects
could not be reliably estimated.)

Figure 11.4-1a. Grade 4, conditioned variables: meanparts group effects
plotted against A.R.M. group effects. [Plot not reproduced.]

Figure 11.4-1b. Grade 8, conditioned variables: meanparts group effects
plotted against A.R.M. group effects. [Plot not reproduced.]

Figure 11.4-1c. Grade 11, conditioned variables: meanparts group effects
plotted against A.R.M. group effects; the point for Unknown Parental
Education is labelled. [Plot not reproduced.]

Figure 11.4-2a. Grade 4, unconditioned variables: meanparts group effects
plotted against A.R.M. group effects; points are marked as like conditioned
(+) or remaining unconditioned (O). [Plot not reproduced.]

Figure 11.4-2b. Grade 8, unconditioned variables: meanparts group effects
plotted against A.R.M. group effects; points are marked as like conditioned
(+) or remaining unconditioned (O). [Plot not reproduced.]

Figure 11.4-2c. Grade 11, unconditioned variables: meanparts group effects
plotted against A.R.M. group effects; points are marked as like conditioned
(+) or remaining unconditioned (O). [Plot not reproduced.]

Section 11.4.3 showed that the potential degree of the bias of the
ARM-based group effect depends on the strength of the relation between the
group and the conditioning variables, with the bias being much smaller for
groups which are highly related to the conditioning variables. For this
reason, we have divided the 136 subgroups addressed by Figures 11.4-2a, 2b
and 2c into two sets. The first set consists of the 23 subgroups formed by
demographic variables which highly resemble one of the conditioning
variables. Included here are subgroups based on the Level of Father's
Education and on the Level of Mother's Education, both of which are used to
construct the Parental Education conditioning variable. The other
subgroups in the set of variables which are like the conditioning variables
are related to the Race/ethnicity conditioning variable. These are:
Language Spoken in the Home (English, Spanish, Other); Are You Hispanic?
(No or Hispanic subgroup); and Ethnicity (American Indian/Alaska Native,
Asian/Pacific Islander, Black, White, Other).

The  second  set  of  subgroups  consists  of  the  remaining  113  subgroups 
which  are  not  so  directly  related  to  the  conditioning  variables. 

Examining the plots in Figures 11.4-2a, 2b and 2c, we see that the two
sets of subgroups tend to cluster along different lines. The first set,
the subgroups like the conditioned variables, is indicated by +'s on the
plots. The relationship between the ARM and meanparts estimates for this
set tends to resemble that of the conditioned variables. This is most
clear for the Grade 8 data (Figure 11.4-2b), where the "like conditioned"
subgroups cluster tightly along a least-squares line with a slope of 1.09
(with a standard error of .03). The relationship between the ARM and
meanparts estimates for the like conditioned subgroups for Grades 4 and 11
is less clear than for Grade 8, but is similar. The least-squares slopes
are 1.08 (standard error of .09) for Grade 11 and 1.21 (standard error .10)
for Grade 4. The higher variability of the points about the lines for
these two grades may be partly due to the prediction of missing information
(the unpresented exercises) at those two grades.

We turn finally to the relationships between the ARM-based and
meanparts group effects for the subgroups not highly related to the
conditioning variables. This set of subgroups is indicated by the O's on
the plots. The slopes of the least-squares lines predicting the meanparts
estimate from the ARM estimate are much higher for this set of subgroups,
being 1.87 for Grade 4, 1.56 for Grade 8 and 1.82 for Grade 11 (the
standard errors are all about .05). This indicates a tendency for the ARM
estimates to be closer to zero than the meanparts estimates so that the
magnitude of subgroup effects will tend to be reduced by using the ARM
estimates. The average reduction is the smallest for Grade 8 where the
ARM-based estimates are around 64 percent of the magnitude of the meanparts
estimates, corresponding to a shrinkage of 36 percent. The average
shrinkage for the other two grades is larger, being roughly 45 percent in
both cases. Again, this is due to the higher degree of prediction of
missing information (the unpresented exercises) necessary for those two
grades.
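
The shrinkage percentages quoted above follow from the reported slopes if the fitted line is taken to pass approximately through the origin, which is an assumption of this sketch: a slope b for predicting the meanparts effect from the ARM effect then implies that the ARM effect is about 1/b of the meanparts effect. The function name is hypothetical.

```python
# Reported least-squares slopes (meanparts effect regressed on ARM effect)
# for subgroups not highly related to the conditioning variables.
SLOPES = {"Grade 4": 1.87, "Grade 8": 1.56, "Grade 11": 1.82}


def shrinkage_from_slope(slope):
    """Fractional shrinkage of the ARM effect relative to the meanparts effect.

    Assumes a zero intercept, so that ARM effect = meanparts effect / slope.
    """
    return 1.0 - 1.0 / slope


for grade, slope in SLOPES.items():
    ratio = 1.0 / slope
    print(f"{grade}: ARM/meanparts = {ratio:.2f}, shrinkage = {shrinkage_from_slope(slope):.2f}")
```

For Grade 8 this gives a ratio of about .64 and a shrinkage of about .36, matching the figures in the text; the values for Grades 4 and 11 come out near the "roughly 45 percent" quoted.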

The general picture so far is that the estimates of group effects based
on ARM plausible values will be essentially unbiased whenever the subgroups
are highly related to the conditioning variables but tend to be
understatements of the sizes of group effects whenever the subgroups are
not highly related to the conditioning variables. We shall see that the
consequences of the tendency of the ARM to understate the size of an
effect, relative to the meanparts estimator, are mitigated to a large
degree when the variabilities of the ARM and meanparts estimators are
considered. Generally speaking, the same conclusions about subgroup
effects will be made based on either of the two estimators.


Consider a test of the hypothesis of no subgroup effect for the set of
subgroups defined by the responses to one of the 44 background and attitude
questions. Each of these 44 questions produces a partitioning of the
population into between two and ten subgroups. The analogue to the standard
F statistic from a one-way analysis of variance, which approximately
takes the sample design into account, is

$$F = \frac{1}{G-1}\left(1 - \frac{G}{n}\right)
      \frac{\sum_{i=1}^{G} f_i\,(t_i - t_\cdot)^2}
           {\sum_{i=1}^{G} f_i\,(f_i - 1/n)\,\mathrm{Var}(t_i)}$$

where

$G$ is the number of subgroups,

$t_i$ is the (ARM or meanparts) estimate of the $i$th subgroup mean
performance,

$t_\cdot$ is the equivalent estimate for the population
(so that $t_i - t_\cdot$ is the subgroup effect),

$\mathrm{Var}(t_i)$ is the estimate of the variance of $t_i$ (which includes
uncertainty in estimation of plausible values for the ARM
estimate),

$f_i$ is the weighted relative frequency of the subgroup in the
population, and

$n$ is the effective sample size.*

* The effective sample size is the observed sample size divided by the
design effect and approximately accounts for the fact that estimates of
variability which take the sample design into account tend to be larger
than conventional (simple random sampling based) estimates by a factor
equal to the design effect (see Chapter 4 for details). For the current
computations, the effective sample size was set equal to 1,000 in all
cases.
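
The following sketch shows how such a statistic might be computed. It assumes the statistic has the form F = [1/(G-1)] (1 - G/n) [sum of f_i (t_i - t.)^2] / [sum of f_i (f_i - 1/n) Var(t_i)], with the symbols as defined in the text; that form reduces to the usual one-way analysis of variance F ratio when Var(t_i) equals the within-group variance divided by n f_i. The function names are hypothetical.

```python
def effective_sample_size(observed_n, design_effect):
    """Observed sample size divided by the design effect."""
    return observed_n / design_effect


def design_adjusted_f(t, var_t, f, t_pop, n_eff):
    """Approximate F statistic for the hypothesis of no subgroup effect.

    t      : list of subgroup mean estimates (ARM or meanparts)
    var_t  : list of estimated variances of those means
    f      : list of weighted relative frequencies of the subgroups
    t_pop  : estimate of the population mean
    n_eff  : effective sample size
    """
    g = len(t)
    # Between-subgroup mean square, weighted by relative frequency
    between = sum(fi * (ti - t_pop) ** 2 for fi, ti in zip(f, t)) / (g - 1)
    # Pooled variance term built from the design-based variances
    within = sum(fi * (fi - 1.0 / n_eff) * vi for fi, vi in zip(f, var_t))
    return (1.0 - g / n_eff) * between / within
```

The resulting value would then be referred to an F distribution with G - 1 numerator degrees of freedom, as described in the text.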

To compute the approximate significance level, the above F statistic
was compared with the F distribution with G - 1 and 32 degrees of freedom.
The denominator degrees of freedom, 32, is equal to the number of PSU pairs
used in calculating the jackknife variance of any statistic based on the
data from the Year 15 assessment and, as discussed in Chapter 4, is an
upper bound for the degrees of freedom of that variance estimate.
Simulation results by Shah, Holt and Folsom (1977) indicate that this is an
appropriate number of error degrees of freedom to use for significance
tests.

For each grade, and each of the 44 background and attitude questions, F
tests in the manner described above were conducted using the ARM plausible
values and using the meanparts estimates and the results compared. To
eliminate the effect of different numbers of denominator degrees of
freedom, the results were converted into cumulative probabilities
(= 1 - significance level) and then, to facilitate plotting, into standard
normal deviates. That is, the values compared were

$$\Phi^{-1}\left(\mathrm{Prob}(F_{G-1,32} \le F)\right)$$

where $\Phi$ is the standard normal cumulative distribution function and
$\Phi^{-1}(a)$ is the normal deviate at the $a$th quantile. The results are
shown in Figures 11.4-3a, 3b and 3c.

The major impression from these figures is that generally the same
qualitative conclusions will be drawn from tests based on either of the ARM
or meanparts estimators. As can be seen, although the ARM-based subgroup
effects tend to be smaller, tests based on the ARM estimates do not appear
to be markedly conservative relative to those based on the meanparts
estimates, and, if anything, appear to be somewhat liberal, at least at
Grade 8. The reason that the hypothesis tests based on the ARM plausible
values are not markedly more conservative than those based on the
meanparts estimates is that the estimates of the sampling variability of
the ARM estimates are also smaller than the corresponding meanparts
variability estimates. Figures 11.4-4a, 4b and 4c, which show the ratio of
the standard error of the ARM group performance estimate to the standard
error of the meanparts group performance estimate plotted against the
meanparts group performance standard error, indicate that the ARM-based
standard errors are, on average, around three-fourths the size of the
equivalent meanparts estimates. This is true for both the conditioned and
the unconditioned variables and also holds for the standard errors of group
effects. The ARM-based standard errors tend to be smaller because the ARM
estimators use the available information about the relationship between the
exercises more efficiently than do the meanparts estimators. Specifically,
the scores of an individual on the exercises not presented to that
individual are partially predictable by the responses to the exercises that
were answered by the individual. This means that each person provides at
least some information about each of the writing exercises administered to
that grade. The ARM capitalizes on this fact. The meanparts estimator
does not consider this information and consequently has a larger variance.




The overall conclusion from these plots is that the general effect of
the bias in the ARM-based estimates is to shrink the size of a group effect
to a value of about half what it would be with the meanparts estimate but
that, after taking the sampling variability into account, very few
qualitative conclusions would be changed by using the ARM-based estimates
rather than the much more computationally intensive meanparts estimates.


Figure 11.4-3a. Grade 4: F-values converted to N(0,1), plotted against
the meanparts writing score values. [Plot not reproduced.]

Figure 11.4-3b. Grade 8: F-values converted to N(0,1), plotted against
the meanparts writing score values. [Plot not reproduced.]

Figure 11.4-3c. Grade 11: F-values converted to N(0,1). [Plot not
reproduced.]

Figure 11.4-4a. Grade 4: SE(A.R.M.) vs SE(meanparts); ratio of standard
errors plotted against the standard error of meanparts. [Plot not
reproduced.]

Figure 11.4-4b. Grade 8: SE(A.R.M.) vs SE(meanparts); ratio of standard
errors plotted against the standard error of meanparts. [Plot not
reproduced.]

Figure 11.4-4c. Grade 11: SE(A.R.M.) vs SE(meanparts); ratio of standard
errors plotted against the standard error of meanparts. [Plot not
reproduced.]

Chapter  12 
BACKGROUND  AND  ATTITUDE  DATA  ANALYSIS 


Albert  E.  Beaton 
Norma  A.  Norris 
Janet  R.  Johnson 

Educational  Testing  Service 


The ETS design of NAEP called for the inclusion of a large number of
background, attitude, and interest questions in addition to the usual
cognitive exercises in the subject areas being assessed. Some questions of
this type, such as the student's sex and levels of parents' education, had
been asked in past assessments; these questions were continued in the Year
15 assessment. ECS supplied to ETS a large number of questions about
teaching and learning styles and habits, which were also included. ETS
added a large number of questions which might be useful for policy
analyses. The result is a very rich database which includes not only
information about reading and writing proficiency but hundreds of other
variables measuring attributes of the students, their schools, and their
teachers. A summary of the background and attitude questions is presented
in Chapter 6.

The wealth of this database has not been fully explored by the NAEP
staff, nor should it be. These data were collected for secondary analysis
by persons interested in various facets of educational policy; we hope that
we have developed a database sufficient for many, varied policy analyses by
many researchers. The NAEP staff has devoted its energy to performing only
those analyses necessary for the reports it produced.

The  trend  reports  have  been  particularly  limiting  because  they  are 
necessarily  restricted  to  variables  that  have  been  used  in  the  past. 
Basically,  these  are  the  reporting  subgroup  variables  of  sex, 
race/ethnicity,  region,  age,  grade,  size  and  type  of  community,  and  level 
of  parents'  education.    To  maintain  trend  analysis  capability,  we  have 
defined  variables  as  closely  as  possible  to  those  used  in  past  assessments. 

This  chapter  is  divided  into  two  sections: 


♦ Reporting subgroups and derived variables. Section 12.1
describes the reporting subgroups and how they are defined.

♦ Other derived variables. The analysis of the Year 15 (1983-84)
writing data incorporated a number of the questions about
writing attitudes and practices. In summarizing the many
questions, several scales were developed using factor analysis
and the WARM scaling method. This process is described briefly
in Section 12.2.


More work on the background and attitude questions, as well as the
generality of WARM scales and their properties, will continue as these
questions are needed for specific analyses.

12.1 Reporting Subgroups and Derived Variables

NAEP  reports  performance  results  for  groups  of  students  rather  than  for 
individual  students.    In  addition  to  reporting  national  results,  NAEP 
reports  information  about  student  subgroups  defined  by  sex,  race/ethnicity 
(both  observed  and  imputed),  region  of  the  country,  grade/age,  level  of 
parent's  education,  and  size  and  type  of  community. 

Some  subgroup  data  were  not  obtained  directly  from  assessment 
responses,  but  were  derived  through  procedures  described  in  Sections 
12.1.3,  12.1.6  and  12.1.7  below. 

Subgroup  data  are  contained  under  the  variable  names  listed  in  Table 
12(1). 

Table 12(1)
Reporting Subgroup Variables

                                    Variable Name
Subgroup                      Student File    School File

Sex                           SEX
Observed Race/Ethnicity       RACE
Imputed Race/Ethnicity        ETHNIC*
Region                        REGION          SREGION
Age                           STUDAGE*
Grade                         NEWGRD
Size & Type of Community      STOC            SSTOC
Parent's Education            PARED*

* Denotes derived variable




The reporting subgroups were determined as follows:

12.1.1 Sex

Responses  were  reported  for  male  and  female  students. 


12.1.2    Observed  Race/Ethnicity 

This is the race/ethnicity of the student being assessed as observed
by the exercise administrator. The observed definition of student
race/ethnicity was the only one used in NAEP assessments prior to Year 15.
This variable should be used for race/ethnicity subgroup comparisons to
previous assessments.


12.1.3 Imputed Race/Ethnicity

This is an imputed definition of race/ethnicity of the student being
assessed, derived from several sources of information. This variable can
be used for race/ethnicity subgroup comparisons within the Year 15
assessment.

Three  common  background  items  were  used  to  determine  race/ethnicity 
for  students  who  participated  in  the  Year  15  assessment  session.  The  items 
were  included  in  every  spiral  assessment  booklet  and  in  each  tape  booklet, 
as  follows: 

Common  Background  Item  Number  2; 
Are  you  Hispanic? 

A.  No 

B. Yes, Mexican, Mexican American, or Chicano

C. Yes, Puerto Rican

D.  Yes,  Cuban 

E.  Yes,  Other  Spanish/Hispanic 
(What?)   


Students  who  responded  to  item  number  2  by  circling  B,  C,  D,  or  E  were 
considered  Hispanic.    For  students  who  circled  A,  did  not  respond  to  the 
item,  or  provided  information  which  was  illegible  or  which  could  not  be 
classified,  responses  to  item  number  1  were  examined  in  an  effort  to 
determine  race/ethnicity.    Item  number  1  read  as  follows: 

Common  Background  Item  Number  1: 
Are you:

A.  American  Indian  or  Alaskan  Native 

B.  Asian  or  Pacific  Islander 

C.  Black 

D.  White 

E.  Other  (What?)   




Students who circled A were considered American Indian; B were
considered Asian; C were considered Black; and D were considered White. If
a student responded by circling E, race/ethnicity was determined in
accordance with the information filled in by the student as "Other
(What?)."

For students who did not respond to item number 1, or who did so by
providing illegible information or information which could not be
classified, responses to item number 4 were examined in an effort to
determine race/ethnicity. Item number 4 read as follows:

Common Background Item Number 4:

What language do most people in your home speak?

A. English

B. Spanish

C. Another language

(What is it?)


A student was considered Hispanic if he or she circled B. For a
student who circled C and indicated that most people in the home spoke
languages which were not English or Spanish/Hispanic, race/ethnicity was
determined by classifying the language specified by the student.

For a student who did not respond to common background items 1, 2 or 4
above, observed race/ethnicity, if provided by the exercise administrator,
was used.

Race/ethnicity could not be classified for a student who did not
respond to background items 1, 2 or 4, and for whom an observed
race/ethnicity was not provided.

The races and ethnicities which were provided by students in response
to items 1, 2 and 4 above are listed in Table 12(2). Slashes indicate
variations in the way races and ethnicities were spelled by students.

Table  12(3)  summarizes  the  procedure  used  to  determine  race/ethnicity. 
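
The cascade summarized in Table 12(3) can be sketched in code. This is a
minimal illustration only, not the original NAEP processing program: the
function name, the argument names, and the convention that write-in answers
arrive already coded to a reporting category (or None when illegible or
unclassifiable) are assumptions made for the example.

```python
from typing import Optional

# Item 1 response codes mapped to reporting categories (from the item text).
ITEM1_CATEGORIES = {
    "A": "American Indian",
    "B": "Asian",
    "C": "Black",
    "D": "White",
}


def impute_race_ethnicity(
    item2: Optional[str],
    item1: Optional[str],
    item4: Optional[str],
    observed: Optional[str],
    item1_other: Optional[str] = None,
    item4_language: Optional[str] = None,
) -> str:
    """Hypothetical helper following the order of checks in Table 12(3).

    item1_other and item4_language stand for the category a coder assigned
    to a write-in answer, or None if the answer could not be classified.
    """
    # Item 2: any "Yes" response means Hispanic.
    if item2 in {"B", "C", "D", "E"}:
        return "Hispanic"
    # Item 1: responses A through D map directly to a category.
    if item1 in ITEM1_CATEGORIES:
        return ITEM1_CATEGORIES[item1]
    # Item 1, response E: use the classified write-in, if any.
    if item1 == "E" and item1_other is not None:
        return item1_other
    # Item 4: Spanish spoken at home means Hispanic.
    if item4 == "B":
        return "Hispanic"
    # Item 4, response C: classify by the language written in, if possible.
    if item4 == "C" and item4_language is not None:
        return item4_language
    # Fall back to the exercise administrator's observation.
    if observed is not None:
        return observed
    return "Unclassified"
```

For example, a student who circled A on item 2, skipped item 1, and circled
A (English) on item 4 would receive the observed race/ethnicity if the
exercise administrator recorded one, and would otherwise be unclassified.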


12.1.4    Size  and  Type  of  Community 

NAEP assigned each participating school to one of seven Size and Type
of Community (STOC) categories. The categories were designed to provide
information about the communities in which the schools were located.

The STOC reporting categories consist of three "extreme" types of
communities and four "residual" community sizes. Schools were placed into
STOC categories based upon information about the type of community, the
size of its population, and an occupational profile of residents
provided by school principals. The principals completed estimates of the
percentage of students whose parents fit into each of six occupational
categories.




Table 12(2)
Race/Ethnicity Classifications


American Indian:
American Indian
Cherokee
Indianamerican/Indianerican
Nativean
Navahoe
Sueinda

Asian:
Amerasian/Amasian
Americanphillippine
Asian
Assirian
Cambodian
Chinese
Eastindi
Eurasian
Filippine/Fillippine/Filappine/Phillipine
Fr India
Guamania
India
Indianasian
Indonesian
Japanese
Japanese-American
Korean
Lasos/Leocean/Laotion/Loas
Oriental
Pacific
Pakistan
Taivanes/Taovames
Thai/Thia
Vietnamese/Veitnamese

Black:
Afroamrican
Black
Blackamerican

Hispanic:
Chicano
Columbian
Dominican
Hispanic
Latin
Mexican
Puerto Rican
Salvadorian
Spanish

White:
Appropriate races/ethnicities were classified as White.

Unclassified:
Races/ethnicities which could not be considered American Indian, Asian,
Black, Hispanic or White were included as unclassified.



Table 12(3)
Determining Race/Ethnicity


Background Item Number 2
2. Are you Hispanic?
   A. No
   B. Yes, Mexican, Mexican American or Chicano
   C. Yes, Puerto Rican
   D. Yes, Cuban
   E. Yes, Other Spanish/Hispanic (What?)

   Student circled B, C, D or E  ->  Student was Hispanic.
   Student circled A, did not respond, or provided either an illegible
   response or a response which could not be classified  ->  continue.

Background Item Number 1
1. Are you:
   A. American Indian or Alaskan Native
   B. Asian or Pacific Islander
   C. Black
   D. White

   Student circled A, B, C or D  ->  Student was: A. American Indian,
   B. Asian, C. Black, or D. White.
   Student did not circle A, B, C or D, or provided either an illegible
   response or a response which could not be classified  ->  continue.

Background Item Number 1
   E. Other (What?)

   Student filled in another race/ethnicity  ->  Student was: American
   Indian, Asian, Black, White, or Hispanic.
   Student did not circle E, or provided either an illegible response or
   a response which could not be classified  ->  continue.

Background Item Number 4
4. What language do most people in your home speak?
   A. English
   B. Spanish
   C. Another language (What is it?)

   Student circled B, or circled C and filled in a language  ->  Student
   was: B. Hispanic; C. American Indian, Asian, Hispanic, Black, or White.
   Student circled A, did not respond, or provided either an illegible
   response or a response which could not be classified  ->  continue.

Observed Race/Ethnicity

   Provided by Exercise Administrator  ->  Student was: American Indian,
   Asian, Black, Hispanic, or White.
   Observed Race/Ethnicity was not provided by Exercise Administrator
   ->  Unclassified Race/Ethnicity.



Schools in extreme rural and low or high metropolitan areas were ranked
in descending order according to the occupational profile, the type of
community, and the size of its population. The top 10 percent of these
schools were assigned to the extreme STOC categories (1, 2 and 3) below.
The remaining schools were classified according to one of the four residual
STOC categories. The three extreme STOC categories are as follows:


STOC 1 - Extreme Rural:


This category was used for schools in rural areas
where a high proportion of adults were farmers or farm
workers and a low proportion were professional,
managerial, or factory workers. At least some of the
students in these schools were from open country or
places with a population of less than 10,000.


STOC 2 - Low Metro:


The Low Metro STOC category was used for schools in
areas where a high proportion of the adult population
was either not regularly employed or on welfare and a
low proportion was employed in professional or
managerial positions. The schools in STOC 2 were
located in cities, or the urbanized area of cities, with
a population greater than 200,000.

STOC 3 - High Metro:

High Metro schools were located in city areas where
a high proportion of adults was employed in professional
or managerial positions and a low proportion were factory or
farm workers, not regularly employed, or on welfare.
STOC 3 schools were located in cities or the urbanized
area of cities with populations greater than 200,000.

Schools which did not fall into STOC 1, 2 or 3 were classified
according to four "residual" STOC categories depending upon the size of the
community in which they were located. The four residual STOC reporting
categories are as follows:


STOC  4  -  Main  Big  City: 

STOC 4 schools were located within the limits of
cities with populations greater than 200,000 but not
classified as High or Low Metro.

STOC  5  -  Urban  Fringe: 

The  schools  assigned  to  STOC  5  were  located  in  the 
urbanized  area,  but  outside  the  limits,  of  cities  with 
populations  over  200,000,  but  not  classified  as  Low  or 
High  Metro. 




STOC 6 - Medium City:


STOC  6  schools  were  located  in  cities  with 
populations  of  between  25,000  and  200,000  which  did  not 
classify  as  fringe  areas  for  big  cities. 

STOC  7  -  Small  Place: 

The  schools  assigned  to  STOC  7  were  located  in 
communities  with  populations  of  less  than  25,000.  These 
communities  were  not  located  in  the  urbanized  areas  of 
big  cities  and  could  not  be  classified  as  Extreme  Rural. 
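
The four residual categories reduce to a short decision rule. The sketch
below is an illustration under stated assumptions, not NAEP's own program:
the function name and arguments are hypothetical, a "big city" is taken to
mean a city with a population greater than 200,000, and the boundary values
(exactly 25,000 or exactly 200,000) are assigned to Medium City because the
text does not say how they were handled. It applies only to schools that
were not already placed in STOC 1, 2 or 3.

```python
def residual_stoc(city_population: int, in_big_city_fringe: bool) -> int:
    """Return the residual STOC category (4-7) for a school.

    city_population: population of the city or place containing the school.
    in_big_city_fringe: True if the school lies in the urbanized area of a
        city of more than 200,000 people but outside that city's limits.
    """
    if in_big_city_fringe:
        return 5  # Urban Fringe
    if city_population > 200_000:
        return 4  # Main Big City
    if city_population >= 25_000:
        return 6  # Medium City (boundary handling is an assumption)
    return 7      # Small Place


print(residual_stoc(350_000, False))  # school inside a big city's limits
print(residual_stoc(40_000, True))    # suburb in a big city's urbanized area
```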


12.1.5  Region 

In  addition  to  overall  responses,  NAEP  computed  data  for  four 
geographical  regions  in  the  United  States.    Table  12(4)  outlines  the 
assignment  of  individual  states  to  each  region. 


Table 12(4)
Geographic Regions


NORTHEAST:

Connecticut             New Hampshire
Delaware                New Jersey
District of Columbia    New York
Maine                   Pennsylvania
Maryland                Rhode Island
Massachusetts           Vermont

SOUTHEAST:

Alabama                 Mississippi
Arkansas                North Carolina
Florida                 South Carolina
Georgia                 Tennessee
Kentucky                Virginia
Louisiana               West Virginia

CENTRAL:

Illinois                Missouri
Indiana                 Nebraska
Iowa                    North Dakota
Kansas                  Ohio
Michigan                South Dakota
Minnesota               Wisconsin

WEST:

Alaska                  New Mexico
Arizona                 Oklahoma
California              Oregon
Colorado                Texas
Hawaii                  Utah
Idaho                   Washington
Montana                 Wyoming
Nevada


12.1.6    Parental  Education 

Students were asked to indicate the extent of their father's education
in one of the following ways:

(1)    He  did  not  finish  high  school; 




(2)  He  graduated  from  high  school; 

(3)  He  went  to  another  school  after  graduating 
high  school; 

(4)  He  graduated  from  college;  or 

(5)  I  Don't  Know. 


Students were asked to provide the same information about the extent of
their mother's education by checking one of the following:

(1)  She  did  not  finish  high  school; 

(2)  She  graduated  from  high  school; 

(3)  She  went  to  another  school  after  graduating 
high  school; 

(4)  She  graduated  from  college;  or 

(5)  I  Don't  Know. 


The information was combined into one parental education reporting
category, as follows:

If  a  student  indicated  the  extent  of  education  for  either  parent,  the 
higher  of  the  two  levels  was  included  in  the  data.    If  a  student  indicated 
that  he  or  she  did  not  know  the  level  of  education  for  both  parents  or 
indicated  that  he  or  she  did  not  know  the  level  of  education  for  one  parent 
and  did  not  respond  for  the  other,  the  parental  education  level  was 
classified  as  unknown.    If  the  student  did  not  respond  for  either  parent, 
the  student  was  recorded  as  providing  no  response. 
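
The combination rule can be written out as a small function. This is a
sketch only: the function name, the use of the response codes 1 through 5
from the two items above, and the use of None for a skipped item are
assumptions made for illustration.

```python
from typing import Optional, Union

DONT_KNOW = 5  # response (5) "I Don't Know" on either item


def parental_education(
    father: Optional[int], mother: Optional[int]
) -> Union[int, str]:
    """Combine the two items into one reporting category.

    father / mother: response code 1-5, or None if the item was skipped.
    Returns the higher reported level (1-4), "unknown", or "no response".
    """
    levels = [r for r in (father, mother) if r in (1, 2, 3, 4)]
    if levels:
        # A level was reported for at least one parent: take the higher.
        return max(levels)
    if father == DONT_KNOW or mother == DONT_KNOW:
        # "Don't know" for both, or for one with the other skipped.
        return "unknown"
    return "no response"


print(parental_education(2, 4))        # higher of the two levels
print(parental_education(5, None))     # unknown
print(parental_education(None, None))  # no response
```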


12.1.7  Grade/Age 

To enhance the utility of assessment data, NAEP began sampling students
by grade as well as age during the Year 15 assessment. As a result, Year
15 data reflect the following grade/age classifications:

Grade 4/Age 9 Students:

For  the  Grade  4/Age  9  sample,  age  was  computed  as 
of  December  31,  1983.    The  sample  includes  many  students 
who  were  both  in  grade  4  and  age  9.    However,  because 
NAEP  collected  data  by  grade  or  age  during  the  Year  15 
assessment,  the  Grade  4/Age  9  sample  also  includes 
students  who  were  age  9  (born  in  1974)  but  not  in  grade 
4,  students  who  were  in  grade  4  but  age  8  or  younger 
(born  in  or  after  1975),  and  students  who  were  in  grade 
4  but  age  10  or  older  (born  in  or  before  1973). 

Grade  8/Age  13  Students: 

For  the  Grade  8/Age  13  sample,  age  was  computed  as 
of  December  31,  1983.    The  sample  includes  many  students 
who  were  both  in  grade  8  and  age  13.    However,  because 
NAEP  collected  data  by  grade  or  age  during  the  Year  15 




assessment, the Grade 8/Age 13 sample also includes
students  who  were  age  13  (born  in  1970)  but  not  in  grade 
8,  students  who  were  in  grade  8  but  age  12  or  younger 
(born  in  or  after  1971),  and  students  who  were  in  grade 
8  but  age  14  or  older  (born  in  or  before  1969). 

Grade  11/Age  17  Students: 

For the Grade 11/Age 17 sample, age was computed as
of September 30, 1984. The sample includes many
students who were both in grade 11 and age 17. However,
because NAEP collected data by grade or age during the
Year 15 assessment, the Grade 11/Age 17 sample also
includes  students  who  were  age  17  (born  between  October 
1,  1966  and  September  30,  1967)  but  not  in  grade  11, 
students  who  were  in  grade  11  but  age  16  or  younger 
(born  after  September  30,  1967),  and  students  who  were 
in  grade  11  but  age  18  or  older  (born  before  October 
1966). 
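
The birth-date ranges quoted above follow from computing age in completed
years as of each reference date. The sketch below checks that arithmetic;
the function name is hypothetical, and treating "age" as completed years on
the reference date is an assumption that happens to reproduce the ranges
given in the text.

```python
from datetime import date


def age_on(birth: date, reference: date) -> int:
    """Age in completed years on the reference date."""
    years = reference.year - birth.year
    # Subtract one year if the birthday has not yet occurred by the
    # reference date in the reference year.
    if (reference.month, reference.day) < (birth.month, birth.day):
        years -= 1
    return years


AGE_9_AND_13_REFERENCE = date(1983, 12, 31)
AGE_17_REFERENCE = date(1984, 9, 30)

# Any birth date in 1974 gives age 9 as of December 31, 1983.
print(age_on(date(1974, 1, 1), AGE_9_AND_13_REFERENCE))
print(age_on(date(1974, 12, 31), AGE_9_AND_13_REFERENCE))

# Births from October 1, 1966 to September 30, 1967 give age 17
# as of September 30, 1984.
print(age_on(date(1966, 10, 1), AGE_17_REFERENCE))
print(age_on(date(1967, 9, 30), AGE_17_REFERENCE))
```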


12.2 Other Derived Variables

The  analysis  of  the  Year  15  writing  data  included  not  only  the  analysis 
of  responses  to  the  cognitive  writing  exercises,  but  the  analysis  of  the 
responses to over 100 non-cognitive questions about the students' attitudes
toward  writing,  their  writing  practices,  their  writing  assignments,  and  the 
kind  of  instruction  and  help  they  received  from  their  teachers.    The  many 
questions  were  reduced  to  a  few  scales  using  components  analysis  and  the 
weighted  average  response  method  (WARM)  of  scaling. 

First,  a  principal  components  analysis  with  Varimax  rotation  was 
performed  to  explore  the  dimensionality  of  the  writing  attitude  and 
activity questions. Eleven components seemed to fit the data adequately
and conform to the theory that led to these questions.

Using these components as a guide, eleven weighted sums of the items
were  defined  as  summary  scales.    The  weighted  average  response  method  was 
used  to  estimate  plausible  values  on  each  background  scale  for  each  student 
who  had  answered  at  least  one  of  the  items  composing  the  scale.  The 
weighted  average  response  method  is  an  extension  of  the  average  response 
method  (ARM)  which  is  discussed  in  Chapter  11.4. 

A detailed description of these writing background variables is
contained in the Procedural Appendix of the cross-sectional report, The
Writing Report Card: Writing Achievement in American Schools, 1984
(Applebee, Langer, & Mullis, 1986b).




Chapter  13 
PARAMETER ESTIMATION

Albert  E.  Beaton 
Educational  Testing  Service 


Given the reading, writing, and background and attitude information
discussed in the preceding chapters, we can now examine what students in
American schools can and cannot do. This chapter describes the process by
which estimates of performance, for the nation as a whole and for selected
subpopulations, were made and how the errors of estimation were produced.
This chapter covers four topics:

*  The  weighting  procedures.    For  any  sample  in  which  the  members 
have  different  probabilities  of  selection,  it  is  important  to 
compute  sampling  weights  for  each  individual.    The  sampling 
weights  are  used  to  make  estimates  of  the  parameters  of  the 
population.    The  NAEP  sampling  weights  were  carefully 
computed,  using  information  derived  from  the  NAEP  sampling 
frame,  from  the  actual  NAEP  data,  from  the  Current  Population 
Survey,  and  from  Census  Reports.    The  weights  were  developed 
by  Westat,  Inc.;  the  process  is  reported  in  Chapter  13.1. 

*  The  estimation  of  uncertainty  due  to  sampling  variability. 
Each  population  estimate  is  to  some  degree  imprecise  because 
it  is  derived  from  a  sample,  and  it  is  important  to  be  aware 
of  the  probable  magnitude  of  the  imprecision  when  interpreting 
the  results.    With  each  parameter  estimate,  we  have  also 
produced  an  estimate  of  its  sampling  error  using  the  jackknife 
method (see Mosteller & Tukey, 1969). The application of the
jackknife  to  the  NAEP  data  is  reported  in  Chapter  13.2. 

*  The  estimation  of  variability  due  to  imputation.    Since  the 
NAEP  sample  was  designed  to  estimate  population  parameters 
rather  than  individual  proficiencies,  individual  proficiencies 
cannot be estimated precisely. The imprecision of the
individual  estimates  results  in  some  additional  uncertainty  in 
the  estimates  of  parameters.    This  additional  uncertainty  can 
be  estimated  separately,  and  this  component  of  error  variance 
can  be  added  to  the  error  variance  due  to  sampling  for  a 
combined  estimate  of  the  uncertainty  of  parameter  estimation. 
The  steps  in  this  process  are  reported  in  Chapter  13.3. 




*     The  production  of  the  basic  tables  of  NAEP  results.    The  basic 
tables  consist  of,  among  other  things,  estimates  of  the  sizes 
of  various  subpopulations,  the  errors  in  estimating  the  sizes 
of  subpopulations,  the  proportion  of  students  able  to  answer 
each  item  correctly  and  their  standard  errors,  and  the  average 
reading  or  writing  proficiency  of  various  subpopulations  of 
students  and  their  standard  errors.    Some  of  the  tables 
contain  trend  data,  that  is,  they  compare  the  Year  15  data 
with  data  from  past  reading  or  writing  assessments. 

These  tables  were  designed  to  be  informative  and  easy  to  use 
for  the  NAEP  staff.    They  were  developed  over  time  as  the 
staff  became  aware  of  more  useful  ways  of  presenting  the  data. 
Books  of  these  tables,  referred  to  as  almanacs,  were  used  as 
the  first  step  in  interpreting  the  data  and  serve  as  reference 
documents.    The  contents  of  the  almanacs  and  their  use  are 
discussed  in  Chapter  13.4. 

The  almanacs  are  far  too  voluminous  to  be  included  in  this 
report  or,  indeed,  to  be  made  available  to  the  public  at 
large. Chapter 13.4 reports that approximately 10,000 tables
have  been  collected  into  24  books,  but  many  more  have  been 
developed. 

The  statistical  tables  do  not  exhaust  the  parameter  estimation 
procedures  used  for  NAEP.    Additional  methods  are  discussed  in  the  NAEP 
reports  for  which  they  were  used. 




Chapter  13.1 
WEIGHTING  PROCEDURES 

Eugene  G.  Johnson 
Educational  Testing  Service 

Morris  H.  Hansen 
Benjamin  J.  Tepping 
Josefina  A.  Lago 
John  Burke 

Westat,  Inc. 


As  is  the  case  in  many  large  scale  sample  surveys,  the  Year  15  NAEP  has 
a  complex  sample  design.    The  goal  of  this  design  was  a  sample  from  which 
estimates  of  population  and  subpopulation  characteristics  could  be  obtained 
with  reasonably  high  precision  (as  measured  by  low  sampling  variability). 
Additionally,  it  was  necessary  that  the  sample  be  economically  and 
operationally  feasible  to  obtain. 

To  accomplish  this  goal,  the  NAEP  used  a  multi-stage  cluster  sample 
design  (see  Chapter  5)  in  which  the  probabilities  of  selection  of  the 
first-  and  second-stage  sampling  units  (PSUs  and  schools)  were  proportional 
to  measures  of  their  size,  but  with  probability  for  subsequent  stages  of 
sampling  such  that  the  overall  probabilities  of  selection  of  students  were 
approximately  uniform,  with  exceptions  for  certain  population  subclasses 
that  were  oversampled  by  design.    This  oversampling  was  done  to  ensure 
adequate  precision  in  the  estimation  of  characteristics  of  the  various 
subpopulations  of  interest.  Students  in  the  extreme  rural  areas  and  in  the 
extreme-low-SES  areas  of  big  cities  were  deliberately  sampled  at  twice  the 
normal  rate  to  obtain  larger  samples  of  respondents  from  those 
subpopulations.    The  result  of  these  differential  probabilities  of 
selection  is  an  achieved  sample  containing  proportionately  more  members  of 
these  subgroups  than  there  are  in  the  population. 

Appropriate  estimation  of  population  characteristics  must  take  this 
disproportional  representation  of  the  various  subgroups  in  the  sample  into 
account.    This  is  accomplished  by  assigning  a  weight  to  each  respondent, 
where  the  weights  properly  account  for  the  sample  design  and  reflect  the 
appropriate  proportional  representation  of  the  various  types  of  individuals 
in  the  population. 




This  chapter  provides  an  overview  of  the  weighting  procedures  used  for 
the  Year  15  assessment  and  includes  the  estimation  of  base  weights, 
adjustment  for  nonresponse,  trimming  of  large  weights,  and 
post-stratification  adjustments.    Further  details  of  these  tasks  can  be 
found in the Westat Report on Sample Selection, Weighting, and Variance
Estimation: NAEP - Year 15 (Lago, Burke, Tepping, & Hansen, 1985). Westat,
Inc. was the subcontractor responsible for these tasks.


13.1.1    Computation  of  the  Base  Weight 

The  starting  point  for  the  estimation  of  respondent  weights  is  the 
classical  (Horvitz-Thompson)  procedure  in  which  the  weight  assigned  to  a 
respondent  is  the  reciprocal  of  the  overall  probability  that  the  respondent 
was  selected  for  assessment.    Since  this  weight  is  the  basis  of  the  final 
respondent  weight,  it  is  called  the  base  weight. 

The  base  weight  assigned  to  a  student  is  the  reciprocal  of  the 
probability  that  the  student  was  invited  to  a  particular  type  of  assessment 
session;  that  is,  a  spiral  session  or  a  particular  tape  session.  That 
probability  is  the  product  of  four  factors: 


(1)  The  probability  that  the  PSU  was  selected; 

(2)  the  conditional  probability,  given  the  PSU,  that  the  school 
was  a  member  of  the  sample  selected  by  RTI  or  any 
supplementary  sample  selected  by  Westat; 

(3)  the  conditional  probability,  given  the  sample  of  schools  in  a 
PSU,  that  the  school  was  allocated  the  specified  type  of 
session;  and 

(4)  the  conditional  probability,  given  the  school,  that  the 
student  was  invited  to  the  specified  type  of  session. 

Thus, the base weight for a student may be expressed as the product

$$W = W_1 \cdot W_2 \cdot W_3 \cdot W_4$$

where

$W_1$ = PSU weight,

$W_2$ = school weight, conditional on the PSU,

$W_3$ = the reciprocal of the conditional probability, given
the sample of schools, that the school is allocated
a specified type of session, and

$W_4$ = the reciprocal of the within-school selection
probability for students sampled for spiral or a
specific tape session.
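
As a numerical illustration of the product form, the sketch below computes
a base weight from four stage probabilities. The function name and the
probabilities are invented for the example; only the rule itself (the base
weight is the reciprocal of the product of the four selection
probabilities) comes from the text.

```python
import math


def base_weight(p_psu: float, p_school: float,
                p_session: float, p_student: float) -> float:
    """Reciprocal of the overall selection probability (hypothetical helper).

    Arguments are the four stage probabilities listed above: PSU, school
    given PSU, session type given the school sample, student given school.
    """
    probs = (p_psu, p_school, p_session, p_student)
    if any(not (0.0 < p <= 1.0) for p in probs):
        raise ValueError("each selection probability must lie in (0, 1]")
    return 1.0 / math.prod(probs)


# Made-up probabilities, for illustration only.
w = base_weight(p_psu=0.1, p_school=0.25, p_session=0.5, p_student=0.2)

# The same weight as a product of the four component weights.
w_components = (1 / 0.1) * (1 / 0.25) * (1 / 0.5) * (1 / 0.2)

print(round(w, 6), round(w_components, 6))  # both approximately 400
```

With these values each sampled student stands for roughly 400 students in
the population.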


The PSU weight, $W_1$, was provided by RTI, the survey subcontractor
of the previous grantee (ECS), and is the reciprocal of the probability of
selection of the PSU. The selection probability of a given PSU was
proportional to the PSU's adjusted measure of size, which is ordinarily the
estimated average enrollment of the three age classes. For counties
identified as extreme rural, the measures of size were doubled. For
big-city PSUs, the adjusted measure of size was derived as the weighted
mean of the estimated enrollments for low socio-economic status tracts and
for the remainder of the PSU, with the low socio-economic status tracts
given twice the weight of the remainder. These adjustments were designed
to effect oversampling for those counties and tracts.

The school weight, $W_2$, is the reciprocal of the conditional probability
of  selection  of  the  school,  given  the  selection  of  the  PSU  containing  the 
school.    This  probability  of  selection  is  proportional  to  an  adjusted 
measure  of  size  for  the  school  which  is  related  to  the  estimated 
number  of  age-eligible  students  within  the  school.    Roughly  equal  measures 
of  size  were  assigned  to  schools  containing  an  estimated  number  of 
age-eligible  students  ranging  from  20  to  160  (for  Age  9)  or  20  to  200  (for 
Ages  13  and  17).    Schools  with  fewer  than  20  age  eligibles  were  assigned 
smaller  measures  of  size  and  schools  above  the  indicated  maximums  were 
assigned  larger  measures  of  size  which  were  proportional  to  the  number  of 
age  eligibles  in  the  school.    If  the  school  was  designated  as  a  member  of 
the  low-SES  stratum  of  a  big  city,  the  size  measure  of  the  school  was 
adjusted  by  doubling. 

The session allocation weight, $W_3$, was computed by enumeration of all
possible  allocations  yielded  by  the  algorithm  used  by  Westat  to  allocate 
tape  and  spiral  sessions  to  sample  schools  or  school  clusters. 

For spiral sessions, the within-school student weight, $W_4$, is simply
the sampling interval for selecting students for spiral sessions. For tape
sessions, the within-school student weight accounts for whether or not
there was spiral sampling in the school and the conditional sampling
interval for tape.


13.1.2    Adjustment  of  Base  Weights  for  Nonresponse 

The base weight for a student was adjusted by two nonresponse factors:
one to adjust for non-cooperating schools and the second to adjust for
students that were invited to the assessment but did not appear either in
the scheduled session or in a makeup session. Thus, the within-PSU
nonresponse adjusted weight was of the form

$$W_w = W_2 \, f_1 \, W_3 \, W_4 \, f_2$$




where the nonresponse adjustment factors, $f_1$ and $f_2$, were computed as
described below.

The practical consequence of the nonresponse adjustments to the weights
is that the distributions of characteristics of the pool of nonrespondents
within a nonresponse class within a PSU are implicitly assumed to be the
same, on average, as the equivalent distributions for the respondents
within the same class within the PSU. That is, within classes the causes
of nonresponse are in effect assumed to be ignorable so that, after
appropriate adjustments of the weights, the pool of respondents can be
fairly considered as a representative sample of the total population of
students.


13.1.3 School Nonresponse Adjustment

A school nonresponse adjustment was applied to the base weight of
students in spiral but not tape sessions, as the four required tapes per
PSU were always allocated to cooperating schools in a PSU. As a result,
only weights for spiral sessions were affected by school nonresponse.

School nonresponse factors were computed separately within each PSU for
one, two, or three classes of schools using as many nonresponse classes as
the number of sampled schools in the PSU and nonresponse pattern allowed.
However, since each class was required to contain at least four or five
schools, often only one class was identified in the PSU.

For any school nonresponse class, s, the school nonresponse factor for
spiral sessions is given by

$$f_{1s} = \frac{\sum_{i \in A} W_{2i} \, G_i}{\sum_{i \in B} W_{2i} \, G_i}$$

where

$W_{2i}$ = school weight (the reciprocal of the probability of
selection of the school conditional on the PSU),

$G_i$ = estimated number of grade-eligible students in school i
based on QED data and/or the Principal Questionnaire,

set A consists of the original sample of eligible schools in
class s (including supplemental, new, and refusing
schools, but not substitutes [as defined in Chapter 4,
Section 4.3]), and

set B consists of all cooperating schools in class s (including
schools that were substituted for non-cooperating
schools).

Note that for a substitute school, $W_{2i}$ is the weight, based on the
measure of size, which would have been used if the school had been selected
by the original probability selection procedure.
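
A small worked example may make the ratio concrete. The sketch below is
illustrative only: the function name, the representation of a school as a
(school weight, estimated grade-eligible count) pair, and all numbers are
assumptions. The numerator runs over the originally sampled eligible
schools in the class (set A) and the denominator over the cooperating
schools (set B).

```python
from typing import Iterable, Tuple

School = Tuple[float, float]  # (school weight, estimated grade-eligibles)


def school_nonresponse_factor(set_a: Iterable[School],
                              set_b: Iterable[School]) -> float:
    """Weighted eligibles in the original sample over weighted eligibles
    in the cooperating schools (hypothetical helper)."""
    numerator = sum(w * g for w, g in set_a)
    denominator = sum(w * g for w, g in set_b)
    if denominator <= 0:
        raise ValueError("no cooperating schools in this class")
    return numerator / denominator


# Made-up class of four sampled schools, one of which refused.
original_sample = [(10.0, 50.0), (10.0, 50.0), (20.0, 25.0), (20.0, 25.0)]
cooperating = original_sample[:3]

f1 = school_nonresponse_factor(original_sample, cooperating)
print(round(f1, 4))  # 2000 / 1500, about 1.3333
```

Multiplying the weights of students in the three cooperating schools by
this factor lets them also represent students in the refusing school.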


13.1.4  Student  Nonresponse  Adjustment 

Student nonresponse adjustment factors were computed separately for
spiral sessions and for each of the four tape sessions within each PSU.

13.1.5  Nonresponse  Adjustment  for  Students  in  Paced  Tape  Sessions 

For each tape session, t, in a PSU, the nonresponse factor $f_{2t}$
($\geq 1$) was computed by

$$f_{2t} = \frac{n_t}{n'_t}$$

where

$n_t$ = number of students invited to the particular tape
session in the PSU, and

$n'_t$ = number of students who completed the session.

Note that in the common situation where all students invited to a tape
session were from a single school, no school weight (such as appears below
in the adjustment factor for spiral sessions) is needed to compute the
nonresponse adjustment factor; the weighted ratio equals the unweighted
ratio. In the occasional situation where a school cluster was involved, it
would have been appropriate to introduce the school weight in the
adjustment. This was not done because of the infrequent occurrence of
school clusters, and because the aggregate effect of applying the school
weights in such cases would have been only marginally different from the
adopted procedure of using the ratio $n_t / n'_t$.
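
The point about weighted and unweighted ratios can be checked with a few
lines of arithmetic. The function name and the counts below are invented
for illustration; the only rule taken from the text is that the factor is
the number of students invited divided by the number who completed the
session.

```python
def tape_nonresponse_factor(n_invited: int, n_completed: int) -> float:
    """Invited over completed for one tape session (hypothetical helper)."""
    if n_completed <= 0 or n_completed > n_invited:
        raise ValueError("need 0 < n_completed <= n_invited")
    return n_invited / n_completed


# Made-up session: 25 students invited, 20 completed.
factor = tape_nonresponse_factor(25, 20)

# With a single school, a common school weight cancels out of the ratio.
school_weight = 12.0
weighted_ratio = (school_weight * 25) / (school_weight * 20)

print(factor, weighted_ratio)
```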

13.1.6    Nonresponse  Adjustment  for  Students  in  Spiral  Sessions 

For  spiral  sessions,  the  student  nonresponse  adjustment  was  made 
separately  for  two  classes  of  students:    those  in  or  above  the  modal  grade 
for  their  age  and  those  below  the  modal  grade. 




The factor for students in class c in a particular PSU was computed by

$$f_{2c} = \frac{\sum_i W_{2i} \, n_{ic}}{\sum_i W_{2i} \, n'_{ic}}$$

where the summations extend over the schools in the PSU, and

$n_{ic}$ = number of spiral invited students in school i and
student class c,

$n'_{ic}$ = number of spiral tested students in school i and
student class c, and

$W_{2i}$ = school weight for school i.
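
The spiral-session factor can be illustrated as a school-weighted ratio of
invited to tested students within one student class. This sketch rests on
an assumption: that the weight in the formula is the school weight, which
is how the note in the tape-session section describes the spiral
adjustment. The function name, the tuple layout, and the numbers are
hypothetical.

```python
from typing import Iterable, Tuple

# (school weight, students invited, students tested) for one student class
SchoolCounts = Tuple[float, int, int]


def spiral_nonresponse_factor(schools: Iterable[SchoolCounts]) -> float:
    """School-weighted invited count over school-weighted tested count,
    summed over the schools in one PSU (hypothetical helper)."""
    schools = list(schools)
    invited = sum(w * n for w, n, _ in schools)
    tested = sum(w * n_tested for w, _, n_tested in schools)
    if tested <= 0:
        raise ValueError("no tested students in this class")
    return invited / tested


# Made-up PSU with two schools, for one student class.
psu_schools = [(10.0, 30, 24), (20.0, 10, 10)]
f2c = spiral_nonresponse_factor(psu_schools)
print(round(f2c, 4))  # 500 / 440, about 1.1364
```

Unlike the tape-session case, the school weights do not cancel here,
because the ratio pools students from schools with different weights.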


13.1.7    Adjustment  for  Missing  Tape  Sessions 

In a few instances, the supervisor inadvertently administered spiral
booklets rather than the assigned tape booklet, or a school which was
allocated a tape session refused just before the assessment was conducted,
without providing enough time to reassign the tape session to another
school. This problem occurred in seven of the 768 tape sessions assigned
to the three grade/age groups.

The  following  imputation  procedure  was  used  to  deal  with  this  type  of 
nonresponse.    For  variance  computation  purposes,  the  64  NAEP  PSUs  had  been 
grouped  into  32  pairs.    Let  the  PSU  requiring  the  imputation  be  called  the 
"recipient"  PSU  and  the  other  member  of  the  same  pair  the  "donor"  PSU.  A 
one-half  subsample  of  the  students  administered  the  particular  tape  session 
in  the  donor  PSU  was  transferred  to  the  recipient  PSU.    The  weights  of 
students  involved  in  the  imputation  were  adjusted  as  follows: 

The  students  that  remained  in  the  donor  PSU  had  their  overall  weight 
doubled  by  doubling  the  within-school  student  weight  since  this  one-half 
subsample  also  represented  those  students  transferred  to  the  recipient  PSU. 

The  overall  weight  for  students  in  the  recipient  PSU  was  the  product  of 
its  original  PSU  weight,  the  other  three  weights,  and  the  student 
nonresponse  adjustment  carried  from  the  donor  PSU.    The  weight  associated 
with  the  allocation  of  the  particular  tape  session,  the  doubled 
within-school  weight,  and  the  student  nonresponse  adjustment  were  carried 
without  modification  from  the  donor  PSU.    To  obtain  the  school  weight  of 
the  recipient  school,  the  school  weight  of  the  donor  school  was  adjusted  by 
the  ratio  of  the  donor  PSU  weight  to  the  recipient  PSU  weight;  that  is: 




$$ W_{2R} = \frac{W_{1D}}{W_{1R}} \, W_{2D} $$
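The weight arithmetic for an imputed tape session can be checked with a short sketch. It assumes the overall weight is the product of five components (PSU weight, school weight, session-allocation weight, within-school student weight, and student nonresponse adjustment), which is a simplification of the full weighting chain. All names and numbers are hypothetical.

```python
def recipient_school_weight(w1_donor, w2_donor, w1_recipient):
    """Donor school weight rescaled by the ratio of donor to recipient PSU weight."""
    return w2_donor * w1_donor / w1_recipient


def overall_weight(w1_psu, w2_school, w3_session, w4_student, f2_nonresponse):
    """Product of the weight components considered in this sketch."""
    return w1_psu * w2_school * w3_session * w4_student * f2_nonresponse


# Hypothetical donor components and recipient PSU weight
w1_d, w2_d, w3, w4, f2 = 10.0, 4.0, 2.0, 5.0, 1.2
w1_r = 8.0

w2_r = recipient_school_weight(w1_d, w2_d, w1_r)          # 4 * 10 / 8 = 5
donor_after = overall_weight(w1_d, w2_d, w3, 2 * w4, f2)  # remaining donor students
recipient = overall_weight(w1_r, w2_r, w3, 2 * w4, f2)    # transferred students
```

Under these assumptions the transferred students carry the same overall weight as the students who remain in the donor PSU, because the rescaled school weight exactly offsets the difference in PSU weights.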


13.1.8    Trimming Extremely Large Weights to Reduce Mean Squared Error

In a number of cases, students were assigned extremely large weights.
One cause of large weights was underestimation of the number of eligible
students in the school, so that a school predicted to have a small number of
eligible students on the basis of QED data (and hence a lower probability
of selection) in fact had a large number of students. Other extremely
large weights arose as the result of high levels of nonresponse coupled
with low to moderate probabilities of selection.

Students with extremely large weights have an unusually large impact on
estimates such as weighted means. Since the variability in weights
contributes to the variance of an overall estimate by an approximate factor
$1 + V^2$, where $V^2$ is the relvariance of the weights, a few extremely large
weights are likely to produce large sampling variances of the statistics of
interest, especially when the large weights are associated with students
with atypical performance characteristics.
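A small sketch shows how a single large weight inflates this factor. It assumes the relvariance is the population variance of the weights (divisor n) divided by the squared mean weight, which makes the factor equal to n times the sum of squared weights over the squared sum of weights. The function name and the example weights are hypothetical.

```python
def weight_variance_inflation(weights):
    """Approximate variance inflation 1 + relvariance of the weights."""
    n = len(weights)
    mean_w = sum(weights) / n
    # Relvariance: variance of the weights divided by the squared mean weight.
    rel_variance = sum((w - mean_w) ** 2 for w in weights) / n / mean_w ** 2
    return 1.0 + rel_variance


equal = weight_variance_inflation([2.0, 2.0, 2.0, 2.0])   # no variability: 1.0
skewed = weight_variance_inflation([1.0, 1.0, 1.0, 9.0])  # 1 + 12/9
```

With equal weights the factor is 1; in the second example one weight nine times the others more than doubles the approximate variance.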

All students responding to a given type of assessment (i.e., spiral or
one of the four tape assessments) within a given school receive the same
weight. Consequently, extremely large weights come in groups corresponding
to students in a given school. To reduce the effect of large contributions
to variance from a small set of sample schools, the weights of such schools
were reduced, that is, trimmed back. (We call this "weight trimming"
although a more proper name would be "weight Winsorizing" to be consistent
with the current terminology from robust and resistant statistics.)
Following this procedure introduces a bias, but is expected to reduce the
mean square error of sample estimates.

The trimming algorithm has the effect, approximately, of trimming the
weight of any school that contributes more than a specified proportion, θ,
to the estimated variance of the estimated number of students in the
population. The trimming was done separately for the spiral assessment and
for each of the four tape assessments. Let


$M$ = number of schools in which the assessment was done,

$W'_i$ = weight assigned to school i (i.e., the product of the PSU
weight, the school weight, the school session weight and
the school nonresponse factor, $W_1 W_2 W_3 f_1$),

$x'_i$ = estimated number of age-eligible students in school i
(i.e., the sum of the within-school weights for the
students assessed, $W_4 f_2$, summed across the students
assessed),

$x_i = W'_i x'_i$ = number of students in the population represented
by the school, and

$\bar{x} = (1/M) \sum_i x_i$ = (1/M)(estimated total number of
age eligibles in the population).


A rough approximation to the variance of $\bar{x}$ is

$$ \frac{1}{M(M-1)} \sum_{i=1}^{M} (x_i - \bar{x})^2 . $$


Westat adopted a trimming method that reduced the weight $W'_i$ for a
small number of schools in such a manner that no school makes a
contribution to the sum shown above that is greater than a specified
proportion θ, where θ is to be determined. That is, for any school j, the
weight $W'_j$, after all weights have been trimmed if required, satisfies the
condition

$$ (x_j - \bar{x})^2 \le \theta V $$

where

$$ V = \sum_i (x_i - \bar{x})^2 $$

is the between school sum of squares.

Because only large weights are to be trimmed, the weight is not to be
altered if $x_j < \bar{x}$.

Equivalently, the condition on the school weight $W'_j$ is

$$ W'_j \le \frac{\bar{x} + \sqrt{\theta V}}{x'_j} $$


The trimming was done iteratively. Using the initial weights, the
weight for each school which failed to satisfy the inequality was reduced
to the value given by the right-hand side of the inequality. Using the
weights as trimmed, the procedure was iterated.
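The iterative trimming can be sketched as follows. This is an illustration under stated assumptions, not the contractor's program: the stopping tolerance `rel_tol` and the pass limit `max_iter` are hypothetical controls, since the number of passes actually run is not stated, and the four-school example is invented.

```python
import math


def trim_school_weights(weights, x_prime, theta, rel_tol=1e-9, max_iter=1000):
    """Iteratively cap school weights so that (x_j - x_bar)^2 <= theta * V.

    weights : initial school weights W'_i
    x_prime : estimated eligible students per school x'_i (all > 0)
    theta   : largest share of the between-school sum of squares allowed
    rel_tol, max_iter : stopping controls (assumptions, not from the report)
    """
    w = list(weights)
    m = len(w)
    for _ in range(max_iter):
        x = [wi * xpi for wi, xpi in zip(w, x_prime)]
        x_bar = sum(x) / m
        v = sum((xi - x_bar) ** 2 for xi in x)
        cap = x_bar + math.sqrt(theta * v)  # largest admissible x_j
        changed = False
        for j in range(m):
            # Only schools above the cap are reduced; x_j <= x_bar never is.
            if x[j] > cap * (1.0 + rel_tol):
                w[j] = cap / x_prime[j]
                changed = True
        if not changed:
            break
    return w


# Four hypothetical schools, one with a very large weight.
trimmed = trim_school_weights([10.0, 20.0, 30.0, 100.0], [1.0] * 4, theta=0.5)
```

In this example the first pass lowers the large weight from 100 to 90, and repeated passes keep lowering it because each trim also lowers the mean and the sum of squares. Running to convergence therefore trims more heavily than a single pass, so a practical implementation might limit the number of passes.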

To determine a value of θ, the schools in each sample were listed in
descending sequence according to the value

$$ (x_j - \bar{x})^2 / V . $$




For alternative values of θ, it was determined how many schools
violated the inequality and what their characteristics were in terms of the
[illegible in source] eligible students, the number found to be
eligible, and possibly other factors. The value of θ to be used was then
chosen by judgment to provide negligible bias while substantially reducing
variance. The chosen value of θ was 10/M, which resulted in a trimming of
the weights for schools as follows:


                        Number of schools

               Spiral       Tape Assessments, by Booklet
Grade/Age    Assessment       64     65     66     67

  4/9            11            0      0      0      1
  8/13            3            0      0      1      0
 11/17            2            1      1      2      0

Since the number of schools assigned a spiral assessment was 580 for
Grade 4/Age 9, 453 for Grade 8/Age 13 and 312 for Grade 11/Age 17, the
percents of schools whose weights were trimmed were 1.9 percent, .6
percent, and .6 percent, respectively. The corresponding numbers of
schools assigned at least one tape session were 251 for Grade 4/Age 9, 205
for Grade 8/Age 13 and 205 for Grade 11/Age 17 (a school could be assigned
one or more spiral and one or more tape sessions). The percents of schools
whose weights for the paced administrations (combined) were trimmed were
.4 percent, .5 percent, and 2 percent, respectively.


13.1.9    Post-stratification

As in most sample surveys with cluster sampling, the sums of respondent
weights are random variables which are subject to sampling variability.
Even if there were no nonresponse, the sums of the respondent weights would
at best provide unbiased estimates of various subgroup totals. However,
since unbiasedness refers to average performance over the possible
replications of the sampling, it is unlikely that any given estimate, based
on the achieved sample, will exactly equal the population value.
Furthermore, the respondent weights have been adjusted for nonresponse and
a number of extreme weights have been reduced.

To reduce the mean squared error of the sample estimates, these weights
were further adjusted so that estimated population totals for a number of
specified subgroups of the population, based on the sum of weights of
students of the specified subgroup, were the same as presumably better
estimates derived from other sources. The details of this adjustment,
which is called post-stratification, appear below.




Post-stratification replaced the "weight smoothing" that was done in
the prior NAEP assessments and has the purpose (as did weight smoothing) of
reducing the mean squared error of the estimated averages or proportions
relating to student subpopulations that span several subgroups of the whole
population. The post-stratification was done separately for the spiral
sessions and each of the four tape sessions within each grade/age group,
because each of these can be viewed as a separate sample of the appropriate
population.

For the spiral assessment, thirteen subgroups were defined in terms of
race, ethnicity, census region, and community size (SDOC) as shown in Table
13.1(1). Each of the thirteen subgroups was further divided into three
classes:

(a)  students  eligible  by  both  age  and  grade; 

(b)  students  eligible  by  age  only; 

(c)  students  eligible  by  grade  only. 

This  resulted  in  39  post-stratification  cells  for  each  age  class.  The 
final  weight  for  a  student  is  the  product  of  the  base  weight  (as  adjusted 
for  nonresponse  and  after  "trimming")  and  a  post-stratification  factor 
whose  denominator  is  the  sum  of  those  weights  for  the  cell  to  which  the 
student  belongs  and  whose  numerator  is  an  adjusted  estimate,  based  on  more 
reliable  data,  of  the  total  number  of  students  in  the  cell. 

The adjusted estimate of the total number of students in a given cell
is a composite of estimates from the Year 15 NAEP sample and independent
estimates projected from Current Population Survey data and
1980 Census data. The adjusted estimate is a weighted mean of the two
estimates, the weights being inversely proportional to the approximate
variances of the NAEP and independent estimates. (Further details are
provided in the Report on Sample Selection.)
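The two pieces of this adjustment, the composite cell total and the post-stratification factor, can be sketched as follows. The function names, cell labels, and numbers are hypothetical, and the variance inputs are assumed to be supplied from elsewhere.

```python
def composite_estimate(naep_total, naep_var, indep_total, indep_var):
    """Weighted mean of two estimates, weights inversely proportional to variance."""
    w_naep = 1.0 / naep_var
    w_indep = 1.0 / indep_var
    return (w_naep * naep_total + w_indep * indep_total) / (w_naep + w_indep)


def poststratify(weights, cells, targets):
    """Scale each weight by (target total for its cell) / (sum of weights in cell)."""
    cell_sums = {}
    for w, c in zip(weights, cells):
        cell_sums[c] = cell_sums.get(c, 0.0) + w
    return [w * targets[c] / cell_sums[c] for w, c in zip(weights, cells)]


# Hypothetical cell: NAEP says 1000 (variance 400), independent source says 1100
# (variance 100), so the more precise independent estimate gets four times the weight.
target_a = composite_estimate(1000.0, 400.0, 1100.0, 100.0)

# Hypothetical students in two cells, "a" and "b"
final = poststratify([10.0, 20.0, 30.0, 40.0], ["a", "a", "b", "b"],
                     {"a": 60.0, "b": 35.0})
```

After the adjustment the weights in each cell sum to that cell's target total, while the relative sizes of the weights within a cell are unchanged.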

The  sample  of  students  in  each  of  the  tape  assessments  was  much  smaller 
than  the  sample  for  the  spiral  assessments.    Consequently,  some  subgroups 
were  collapsed  for  post-stratification  as  follows: 


1,  2  6,  7 

3  8,  9 

4  10,  11,  12 

5  13 


Furthermore,  there  was  no  subdivision  into  eligibility  classes,  so  that 
there  were  eight  post-stratification  cells  for  each  age  class.  The 
numerators  of  the  post-stratification  factors  for  these  cells  were  the 
corresponding  adjusted  estimates  used  for  computing  the  spiral 
post-stratification  factor.    For  each  of  the  four  tape  assessments,  the 
denominators  were  the  sums  of  the  weights  for  each  age  class. 




Table 13.1(1)
Major Subgroups for Post-Stratification

Subgroup   Race    Ethnicity      Region            SDOC*

    1      White   Non-Hispanic   NE                1, 2
    2      White   Non-Hispanic   NE                3, 4, 5
    3      White   Non-Hispanic   SE, Central       1, 2
    4      White   Non-Hispanic   SE, Central       3
    5      White   Non-Hispanic   SE, Central       4, 5
    6      White   Non-Hispanic   West              1, 2
    7      White   Non-Hispanic   West              3, 4, 5
    8      Any     Hispanic       NE, SE, Central   Any
    9      Any     Hispanic       West              Any
   10      Black   Non-Hispanic   NE                Any
   11      Black   Non-Hispanic   SE                Any
   12      Black   Non-Hispanic   Central, West     Any
   13      Other   Non-Hispanic   Any               Any

*SDOC (Sample Description of Community) categories: 1 = Big City;
2 = Fringe of Big City; 3 = Medium City; 4 = Small Place; and 5 = Extreme
Rural.




13.1.10    The Final Student Weight: The Full-Sample Weight


The  final  weight  assigned  to  a  student  is  the  student  full-sample 
weight.    This  weight  is  the  student's  base  weight  after  the  application  of 
the  various  adjustments  described  above  in  Sections  13.1.2  through  13.1.9. 

The  student  full-sample  weight  was  used  to  derive  all  estimates  of 
population  and  subpopulation  characteristics  which  have  been  presented  in 
the  various  NAEP  reports,  including  simple  estimates  such  as  the  proportion 
of  students  of  a  specified  type  who  would  respond  in  a  certain  way  to  an 
exercise  and  more  complex  estimates  such  as  mean  proficiency  levels. 

The  estimation  of  the  variability  of  these  estimates,  however,  involves 
the  use  of  another  set  of  weights,  in  fact,  32  other  weights  in  all.  These 
weights  are  closely  related  to  the  student  full-sample  weight,  but  differ 
in  a  manner  which  greatly  facilitates  the  estimation  of  sampling 
variability by the jackknife variance estimation technique. These weights
and  the  jackknife  estimator  are  discussed  in  the  next  chapter. 




Chapter 13.2

ESTIMATION OF UNCERTAINTY DUE TO SAMPLING VARIABILITY


Eugene  G.  Johnson 
Educational  Testing  Service 


A  major  source  of  uncertainty  in  the  estimation  of  the  value  in  the 
population  of  a  variable  of  interest  exists  because  information  about  the 
variable  is  obtained  only  on  a  sample  from  the  population.    To  reflect  this 
fact,  it  is  important  to  attach  to  any  statistic  (e.g.,  a  mean)  an  estimate 
of  the  sampling  variability  to  be  expected  for  that  statistic.  (The 
estimation  of  variability  due  to  imperfect  measurement,  discussed  in 
Chapter  13.3,  is  also  essential). 

Estimates of sampling variability are designed to provide information
about how much the value of a given statistic would be likely to change if
the statistic had been based on another, equivalent, sample of individuals
drawn in exactly the same manner as the achieved sample. Consequently, the
estimation of the sampling variability of any statistic must take into
account the design of the sample.

The  NAEP  sample  is  obtained  via  a  stratified  multi-stage  probability 
sampling  design  which  includes  provisions  for  sampling  certain 
subpopulations  at  higher  rates.  Additional  characteristics  of  the  sample 
include  adjustments  of  the  weights  for  both  nonresponse  and 
post-stratification.    The  resulting  sample  has  very  different  statistical 
characteristics  from  those  of  a  simple  random  sample.    In  particular, 
because  of  the  effects  of  cluster  selections  (PSUs  and  schools  within  PSUs) 
and  because  of  effects  of  nonresponse  and  post-stratification  adjustments, 
observations  made  on  different  students  cannot  be  assumed  to  be  independent 
of  each  other  (and  are,  in  fact,  generally  positively  correlated). 

Treatment  of  the  data  as  a  simple  random  sample,  with  disregard  for 
the  special  characteristics  of  the  NAEP  sample  design,  will  tend  to  produce 
underestimates  of  the  true  sampling  variability. 


13.2.1    Linear  and  Nonlinear  Estimators 

The statistics which are obtainable from a sample can be grouped into
two major types: linear and nonlinear. A linear statistic can always be
represented as a sum of the form $\sum_i a_i X_i$ where the $X_i$ are linear combinations
of the observations and the $a_i$ are fixed constants. A nonlinear statistic
is anything else.




For definiteness in what follows, let t(y, w) be any statistic which
is a function of the sample responses y and the weights w (both vectors).
The statistic t provides an estimate of some population value of interest
T. Because of the adjustments for nonresponse and the adjustments from
post-stratification, the adjusted weights are random variables and
consequently aggregate estimators of the form $\sum_i W_i Y_i$ are nonlinear estimators.
Moreover, even if the weights were not adjusted, estimates of ratios of the
form $\sum_i W_i Y_i / \sum_i W_i X_i$ are nonlinear, as are the more complex estimators based on
item response theory. The nonlinearity of these estimators complicates the
evaluation of their sampling variability.

The  sampling  variability  of  the  nonlinear  estimates  from  the  NAEP  data 
is  estimated  by  a  jackknife  procedure.    The  particular  jackknife 
methodology used will be detailed below. For an explanation of the concept
of the jackknife see Mosteller and Tukey (1969).

A  property  of  jackknife  methodology  is  that,  when  properly  applied  to 
the  same  data,  a  jackknife  estimate  of  the  variability  of  a  linear 
estimator  will  produce  the  same  result  as  the  standard  textbook  variance 
estimate.    Additionally,  the  jackknife  estimator  is  a  continuous  function 
of a nonlinear estimator. Because of these properties, approximate
characteristics of the jackknife estimator in the nonlinear situation (to a
first-order degree of approximation) can be inferred from the
characteristics in the linear situation.

13.2.2    Accounting  for  the  Effects  of  Clustering,  Stratification 
and  Systematic  Selection 

Because  the  NAEP  respondents  are  obtained  by  multi-stage  cluster 
sampling,  the  variance  of  any  estimate  t  is  composed  of  components  of 
variability  due  to  each  of  the  stages  of  selection.    Furthermore,  this 
variance  should  account  for  the  fact  that  the  selection  of  the  units  within 
PSUs at each stage is by systematic sampling and (except for the last
stage)  with  probabilities  proportional  to  measures  of  size.  Appropriate 
estimation  of  the  sampling  variability  of  an  estimate  is  aided  by  the 
remarkable  and  convenient  fact  that  variance  estimates  based  on  the 
differences  between  PSU  estimates  also  appropriately  reflect  the 
variability  within  PSUs,  no  matter  how  the  subsampling  was  done,  as  long  as 
the  subsample  taken  within  each  PSU  is  a  probability  sample  and  does  not 
depend  on  the  subsample  taken  in  another  PSU.    (For  a  discussion  see 
Wolter,  1985,  Section  2.4.5.;  Hansen,  Hurwitz,  &  Madow,  1953,  p.  258.) 

Estimation  of  the  sampling  variability  of  a  statistic  t  thus  comes 
down  to  the  estimation  of  the  variance  between  PSUs  (within  strata)  of  the 
sample  estimates  for  these  PSUs.    Appropriate  estimation  by  this  approach 
will  reflect  approximately  the  combined  effect  of  the  between-  and 
within-PSU  contributions  to  variance.    The  sample  of  PSUs  was  obtained  by 
sampling  with  inclusion  probability  proportional  to  an  adjusted  measure  of 
size  without  replacement,  using  the  algorithm  developed  by  Chromy  (1979). 
Since the selection was based on geographically ordered lists of PSUs
within 19 sampling strata based on region by size and type of community,
this produced a sample with a reasonable geographic representation. For
the purposes of variance estimation, we have followed the common practice
of pairing the PSUs in a manner consistent with the sample design and then
regarding each pair as the members of a pseudo-stratum for variance
estimation purposes. This results in a set of PSU pairs where the PSUs
within a pair are nearly always both from the same stratum and tend to be
geographically close to each other. Since there are 64 PSUs in total, this
results in 32 pairs.


13.2.3    Estimation of Variability of Any Statistic by the Jackknife

We now turn to the general procedure used by ETS to estimate the
sampling variability of any statistic t(y, w) which is a function of
sample values y and weights w. As noted above, this is done by a jackknife
procedure.

As noted in the last section, for the estimation of the
sampling variability, it is sufficient to restrict one's attention to the
estimation of variability of the sample estimates for each pair of PSUs in
the sample. The jackknife method estimates the sampling variability of any
statistic as the sum of components of variability which may be attributed
to each of the PSU pairs. The variance attributed to a particular PSU pair
is measured by estimating how much the value of the statistic would change
if the information embodied in the PSU pair were to be changed.

This is done by the computation of a quantity $t_i$ called a
pseudo-replicate, which is associated with the ith PSU pair, and which is an
estimate of the statistic of interest t based on an altered sample.
Specifically, the ith pseudo-replicate of the statistic t is created by
eliminating the data from the first PSU of the pair, replacing the lost
information with that from the second PSU of the pair (so that the second
PSU is included twice), and then re-estimating the statistic based on this
altered set of data.

The jackknife estimate of the variability of the statistic t used by
ETS is the sum of the squared differences between each pseudo-replicate and
the overall value:

$$ \mathrm{Var}(t) = \sum_{i=1}^{M} (t_i - t)^2 $$

where M = 32 is the number of PSU pairs.

In practical terms, the major expenditure of resources in the
computation of a jackknife variance estimate occurs in the construction of
the pseudo-replicates. The method used by NAEP is detailed below. This
method is applicable to the estimation of a wide range of statistics, is
straightforward in its implementation, and, because adjustments were
carried through separately for each replicate, approximately accounts for
sources of variability due to nonresponse adjustment and
post-stratification. Implementation of this method requires 32
re-computations of the statistic of interest.


Specifically, let t = t(y, W) represent the value of the statistic of
interest when it is computed on the full sample using the full-sample
weights. (The full-sample weight is the reciprocal of the probability of
selection of the student, adjusted for nonresponse and
post-stratification.)

The computation of pseudo-replicates of any such statistic involves the
use of 32 sets of weights, which we shall refer to as JKWT01 through
JKWT32. The set of weights JKWTi is identified with the ith PSU pair and
is used to compute $t_i$, the ith pseudo-replicate of the statistic t. The
value of this pseudo-replicate is

$$ t_i = t(y, \mathrm{JKWT}i) $$

which is simply the statistic t re-computed by using the weights JKWTi
instead of the full-sample weights (W).

The set of weights for the ith PSU pair, JKWTi, is computed as
follows:

(1)  Let $W_j^B$ be the base weight for student j. The base weight is
the reciprocal of the student's overall probability of
selection and is not adjusted by post-stratification or
adjusted for nonresponse.

(2)  Let

$$ W_j^{Bi} = \begin{cases}
0 & \text{if student j is in the first PSU of PSU pair i,} \\
2 W_j^B & \text{if student j is in the second PSU of PSU pair i,} \\
W_j^B & \text{if student j is not in either PSU of PSU pair i.}
\end{cases} $$

This set of pseudo-replicate base weights effects the
elimination of the first PSU of the pair and replaces it in
the sample with the second PSU of the pair.

(3)  Adjust the set of pseudo-replicate base weights produced by
Step 2 for nonresponse and post-stratification by treating
them as if they were base weights for the sample. These
adjustments take into account the grade/age of the student and
the mode of administration. The result is JKWTi.
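Step 2 can be sketched in a few lines. The function and argument names are hypothetical, and the sketch stops before Step 3, so its output is the set of pseudo-replicate base weights rather than JKWTi itself.

```python
def replicate_base_weights(base_weights, pair_of_student, psu_in_pair, i):
    """Pseudo-replicate base weights for PSU pair i (Step 2 only).

    base_weights    : base weight of each student
    pair_of_student : PSU pair number of each student
    psu_in_pair     : 1 if the student is in the first PSU of the pair, else 2
    """
    out = []
    for w, pair, position in zip(base_weights, pair_of_student, psu_in_pair):
        if pair != i:
            out.append(w)        # student outside pair i: weight unchanged
        elif position == 1:
            out.append(0.0)      # first PSU of pair i: dropped
        else:
            out.append(2.0 * w)  # second PSU of pair i: counted twice
    return out


# Four hypothetical students in two PSU pairs
base = [10.0, 20.0, 30.0, 40.0]
pairs = [1, 1, 2, 2]
position = [1, 2, 1, 2]
rep1 = replicate_base_weights(base, pairs, position, 1)
rep2 = replicate_base_weights(base, pairs, position, 2)
```

Each replicate changes only the students in one PSU pair; every other student keeps the original base weight until the nonresponse and post-stratification adjustments of Step 3 are applied.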




Because JKWTi is the set of pseudo-replicate base weights after
adjustment for nonresponse and post-stratification, the effects of those
adjustments on the value of the statistic t are approximately accounted
for in the estimate of the variance of t attributed to the ith PSU pair.

As an example, details for the computation of the jackknife variance of
a weighted mean follow:

Let $Z_k$ be the value of some measurement of interest
for student k and let $W_k$ be that student's full-sample
weight. The statistic of interest is the weighted mean
value of Z:

$$ t = \sum_{k=1}^{n} W_k Z_k \Big/ \sum_{k=1}^{n} W_k $$

Note that if $Z_k$ can only take values 0 or 1, then
t is the weighted proportion receiving a value of 1.

Let $W_k^i$ = value of JKWTi for student k. The
pseudo-replicate for the ith PSU pair is

$$ t_i = \sum_{k=1}^{n} W_k^i Z_k \Big/ \sum_{k=1}^{n} W_k^i , $$

the jackknife variance of the weighted mean is

$$ \mathrm{Var}(t) = \sum_{i=1}^{32} (t_i - t)^2 $$

and the jackknife standard error of the mean is the
square root.
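The weighted-mean example can be written as a short sketch. It takes the replicate weight sets as given, and the toy data use two PSU pairs instead of 32 purely to keep the arithmetic visible. All names and values are hypothetical.

```python
import math


def weighted_mean(z, w):
    """Weighted mean of z with weights w."""
    return sum(wk * zk for wk, zk in zip(w, z)) / sum(w)


def jackknife_variance(z, full_weights, replicate_weights):
    """Sum of squared differences between each pseudo-replicate and the full estimate."""
    t = weighted_mean(z, full_weights)
    return sum((weighted_mean(z, rw) - t) ** 2 for rw in replicate_weights)


# Toy data: students 0 and 1 form pair 1, students 2 and 3 form pair 2.
z = [1.0, 0.0, 1.0, 1.0]
full = [1.0, 1.0, 1.0, 1.0]
replicates = [
    [0.0, 2.0, 1.0, 1.0],  # pair 1: first PSU dropped, second doubled
    [1.0, 1.0, 0.0, 2.0],  # pair 2: first PSU dropped, second doubled
]
variance = jackknife_variance(z, full, replicates)
std_error = math.sqrt(variance)
```

Here the full-sample mean is 0.75, the first pseudo-replicate is 0.5, and the second is 0.75, so only the first pair contributes to the variance.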


13.2.4    The  Degrees  of  Freedom  of  the  Variance  Estimate 

It is important to have an indication of the number of degrees of
freedom to attribute to the jackknife variance estimator Var(t). The
degrees of freedom of a variance estimator provide information on the
stability of that estimator; the higher the number of degrees of freedom,
the lower the variability of the estimator. In practical terms, the number
of degrees of freedom of the variance estimator corresponds to the number
of residual degrees of freedom that can be assumed for inferential
procedures.

Note that the jackknife procedure estimates the sampling variability of
the statistic by assessing the effect of change in the sample at the paired
PSU level. For this reason, the number of degrees of freedom of the
variance estimator Var(t) will be at most equal to the number of PSU
pairs. The number of degrees of freedom equals the number of independent
pieces of information used to generate the variance. In the current case,
the pieces of information are the 32 squared differences $(t_i - t)^2$, each
supplying at most one degree of freedom (regardless of how many individuals
were sampled within any PSU).

Increasing  the  number  of  individuals  sampled  within  any  PSU  results  in 
a  lower  estimate  of  sampling  variability  because  the  within-PSU  component 
is  reduced.    This,  however,  does  not  improve  the  estimation  of  the 
between-PSU  component  of  variability,  which  depends  on  the  number  of  PSUs 
selected.    (It  does  slightly  reduce  the  overall  error,  however.) 

The number of degrees of freedom of the sample variance estimator can
be strictly less than the number of PSU pairs. For example, suppose that
the statistic t is a mean for some subgroup and no members of that subgroup
can come from either PSU in the ith PSU pair. (Examples of such a
subgroup are any PSU-level partitioning of the population, such as region.)
If the pseudo-replicate weights, JKWTi, had not been adjusted for
post-stratification, then since no members of the subgroup come from either
member of the PSU pair i, the resulting pseudo-replicate $t_i$ would be
identical to the overall estimate t so that $(t_i - t)^2 = 0$. In this case,
such a PSU pair imparts no information about the variability of the
statistic t and thus contributes zero degrees of freedom to the variance.

However, it is generally the case that $t_i$ does not equal t, even when
neither member of the PSU pair i contains observations from the subgroup in
question. This is because the pseudo-replicate weights have been adjusted
for post-stratification without regard to the grouping and so all weights
have been altered. In the instance that neither member of the PSU pair i
directly contributes to the estimate of t, the component $(t_i - t)^2$ is
measuring the effect of post-stratification on the estimate. While being
nonzero, such a component is likely to be smaller in magnitude than the
squared difference $(t_k - t)^2$ for any PSU pair k which does contribute to
the estimate of t.

In general, the squared difference $(t_i - t)^2$ will be estimating the
variance component $\sigma_i^2$, say, which is the contribution to the sampling
variance of the statistic t which can be attributed to the samples within
the ith PSU pair. That is, Var(t) is estimating

$$ \sum_{i=1}^{M} \sigma_i^2 . $$

If a few of the $\sigma_i^2$ are markedly larger than the remainder, as in the above
case, then Var(t) is predominantly estimating the sum of these larger
components, which dominate the remaining terms. The effective degrees of
freedom of Var(t) in this case should be nearer to the number of dominant
terms. For a nonlinear estimator, the relationship of the number of
degrees of freedom to the contribution that each pair makes to the total
estimate of variance is more complicated.

An estimate of the effective number of degrees of freedom for Var(t)
can be obtained from an approximation due to Satterthwaite (1946), which assumes that
the differences $t_i - t$ are independent and approximately normally
distributed, with zero means but possibly different variances, $\sigma_i^2$. Hence
the squared differences are each distributed like a chi-square random
variable with 1 degree of freedom times a constant. The Satterthwaite
approximation to the distribution of Var(t) comes from equating the
expectation and variance of Var(t) with those of a chi-squared distribution
(times a constant). Specifically, Var(t) is approximately distributed like the
constant

$$ \frac{1}{df} \sum_{i=1}^{M} \sigma_i^2 $$

times a chi-squared random variable with df
degrees of freedom, where df is the effective number of degrees of
freedom of Var(t) defined by

$$ df = \frac{\left( \sum_{i=1}^{32} (t_i - t)^2 \right)^2}{\sum_{i=1}^{32} (t_i - t)^4} $$

which is never larger than 32. (See Cochran, 1977, p. 96 for further
discussion.)
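The effective degrees of freedom can be computed directly from the pseudo-replicates. The sketch below assumes the ratio form of the Satterthwaite estimate, the squared sum of squared differences over the sum of fourth powers; the function name and inputs are hypothetical.

```python
def effective_df(pseudo_replicates, t):
    """Satterthwaite-type effective degrees of freedom of the jackknife variance."""
    squared = [(ti - t) ** 2 for ti in pseudo_replicates]
    denominator = sum(s * s for s in squared)  # sum of fourth powers
    if denominator == 0:
        raise ValueError("all pseudo-replicates equal the full-sample estimate")
    return sum(squared) ** 2 / denominator


# 32 pairs contributing equally: every pair supplies one degree of freedom.
df_equal = effective_df([1.0] * 32, 0.0)

# One dominant pair: the estimate behaves as if it had a single degree of freedom.
df_dominant = effective_df([10.0] + [0.0] * 31, 0.0)
```

The two extremes bracket the usual situation: equal contributions give the maximum of 32, and the value falls toward the number of dominant pairs as the contributions become more unequal.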

13.2.5    Alternative Jackknife Estimators

It should be noted that there are a variety of alternative jackknife
estimators of variance available in addition to the one given here (see Wolter,
1985).

In particular, two commonly used jackknife estimators are

$$ \frac{1}{2} \left( \sum_{i=1}^{M} (t_i - t)^2 + \sum_{i=1}^{M} (t_i^* - t)^2 \right) $$

and

$$ \frac{1}{4} \sum_{i=1}^{M} (t_i - t_i^*)^2 $$

where $t_i^*$ is an analogous pseudo-replicate to $t_i$ formed by eliminating the
second PSU of the pair and double counting the first.

In the case of a linear estimator, all of these methods will produce
the same result. Furthermore, in the case of the estimation of sampling
variability of a ratio estimate (such as a weighted mean), Monte Carlo
experimentation based on the Year 15 National Assessment of Educational
Progress design indicated trivial differences in the three estimates (see
Lago, Burke, Tepping, & Hansen, 1985). The ETS estimator Var(t) requires
half the computations of the other estimators, at apparently minimal loss
(in terms of variability of the variance estimator).
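The agreement of the three estimators for a linear statistic can be
checked numerically. The sketch below is illustrative only: it assumes the
statistic is a simple total over paired PSUs, and the function name and
the data are hypothetical.

```python
def jackknife_variants(pairs):
    """Three paired-jackknife variance estimates for a total.

    pairs: list of (first, second) PSU totals, one tuple per pair
           (hypothetical data layout).
    Returns (ETS estimator, averaged estimator, difference estimator).
    """
    t = sum(a + b for a, b in pairs)
    # t_i: drop the first PSU of pair i and double count the second.
    t_i = [t - a + b for a, b in pairs]
    # t_i*: drop the second PSU of pair i and double count the first.
    t_star = [t - b + a for a, b in pairs]

    v_ets = sum((x - t) ** 2 for x in t_i)
    v_avg = 0.5 * (sum((x - t) ** 2 for x in t_i)
                   + sum((x - t) ** 2 for x in t_star))
    v_diff = 0.25 * sum((x - y) ** 2 for x, y in zip(t_i, t_star))
    return v_ets, v_avg, v_diff


v_ets, v_avg, v_diff = jackknife_variants([(3.0, 5.0), (10.0, 7.0), (2.0, 2.0)])
```

For a total, each t_i* - t is the negative of t_i - t, so all three
expressions reduce to the same sum of squares. For nonlinear statistics
such as ratio estimates this identity no longer holds exactly.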




Chapter 13.3
ESTIMATION OF VARIABILITY DUE TO IMPUTATION


Robert  J.  Mislevy 
Educational  Testing  Service 


Potential users of NAEP data should be aware of the special properties
of the NAEP database that affect the validity of conventional techniques of
statistical inference. Because of the specialized methods used to estimate
reading and writing proficiencies in NAEP, the resulting proficiency values
have different properties from ordinary test scores. Therefore, standard
procedures for statistical inference should not be applied to the NAEP data
without modification.


13.3.1    Properties  of  NAEP  Data  That  Result  from 
Proficiency  Estimation  Procedures 

In conventional applications of item response theory (IRT) scaling, the
number of items administered to each respondent is sufficient to obtain a
reasonably precise estimate of each individual's proficiency. In NAEP,
however, the goal is to estimate group means, rather than individual
proficiency values. Some respondents may answer only a few questions.
Procedures described in detail below are used to estimate a distribution of
plausible values for each respondent and to draw values at random from this
distribution. The resulting values are appropriate for calculating
statistics based on certain groups, but do not represent precise estimates
of proficiency for individual respondents.

Use  of  this  method  of  estimating  proficiency  results  in  an  increase  in 
the  variability  of  statistics  such  as  means  and  regression  coefficients. 
Thus, there are two reasons that the standard errors of these statistics
are  larger  than  the  values  that  would  be  obtained  with  conventional 
formulas:    the  use  of  cluster  sampling,  which  results  in  non-independent 
observations;  and  the  use  of  proficiency  estimation  methodology  that 
provides  consistent  estimates  of  selected  group  characteristics,  but  does 
not  yield  precise  estimates  for  individual  respondents. 

Another  property  of  the  proficiency  estimates  based  on  plausible  values 
is  that  for  some  subgroups  of  respondents,  mean  proficiencies  may  be 
biased.    This  is  explained  in  Chapters  10.3  and  11.4. 




13.3.2    Using Proficiency Values to Estimate Variability


Jackknifing provides a reasonable estimate of uncertainty due to
sampling from a finite population when the variable of interest is observed
without error from every respondent. As noted in Chapter 10.3, however,
some of the key reporting variables in NAEP are not observed without error.
Although both reading proficiency and writing proficiency are construed to
characterize individual respondents, these proficiencies are not observed
directly from any respondent. They are instead inferred imperfectly from
responses to a few reading or writing exercises.

Each respondent provides answers to too few cognitive exercises to
provide an accurate point estimate of his or her ability. However, as
described in Chapter 10.3, it is possible to summarize what is known about
the proficiency value $\theta_i$ of respondent i given his or her responses
to cognitive exercises ($x_i$) and background variables ($y_i$) in terms of
a probability distribution $p(\theta_i | x_i, y_i)$. For computational
convenience, these distributions have been approximated by a set of five
"plausible values" $\theta_{i1}$ through $\theta_{i5}$, drawn at random
from $p(\theta_i | x_i, y_i)$. They are labeled RDVAL1 through RDVAL5 and
WRTVAL1 through WRTVAL5 on the user tape. The spread of these plausible
values reflects the uncertainty about the $\theta$ value associated with
that respondent given the observable variables $x_i$ and $y_i$. The
background variables y used in constructing these plausible values are:

(1)  age; 

(2)  grade; 

(3)  region  of  the  country; 
(4)  parental education;

(5)  sex; 

(6)  ethnicity;  and 

(7)  size  and  type  of  community. 

Let $t(\theta, y)$ be a statistic, or a function of the values of
$\theta$ and y in the sample, estimating a population value T. Examples of
statistics t would be weighted means, percentile points, and regression
coefficients. If $\theta$ were observed directly for sampled pupils, it
would be possible to approximate the precision of t through standard
methods for survey samples, such as the jackknife technique described
above; the result would be, say, Var(t). This value addresses uncertainty
due to sampling only. Using plausible values, the additional uncertainty
incurred when $\theta$ is not observed directly can be managed in the
following manner:

(1)  Using the first vector of plausible values for respondents,
RDVAL1, evaluate t as if the plausible values were the true
values of $\theta$. Denote the result $t_1$.

(2)  Using the multiple weight jackknife approach, compute the
estimated sampling variance of $t_1$, or Var($t_1$), with respect
to respondents' first vectors of plausible values.




(3)  Carry out steps (1) and (2) for the second through fifth
vectors of plausible values, thus obtaining $t_u$ and Var($t_u$)
for u = 2,...,5.


(4)  The best estimate of t obtainable from the plausible values is
the average of the five values obtained from the different
sets of plausible values:

    \[ t_{\cdot} = \sum_{u=1}^{5} t_u / 5 \]


(5)  An estimate of the variance of $t_{\cdot}$ is the sum of two
components:

    \[ Var(t_{\cdot}) = \sum_{u=1}^{5} Var(t_u) / 5
                      + \sum_{u=1}^{5} (t_u - t_{\cdot})^2 / 5 . \]

The first component in Var($t_{\cdot}$) reflects uncertainty due to
sampling respondents from the population; the second component
reflects uncertainty due to the fact that sampled respondents'
$\theta$'s are not known precisely, but only indirectly through x
and y.

The  first  component  in  Var(t.)  is  attainable  by  jackknife  methods  for 
means  as  described  in  the  preceding  section.    Jackknifing  could  also  be 
applied  to  more  complicated  statistics  such  as  regression  coefficients. 
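The five-step procedure can be sketched in a few lines once the per-set
estimates and their jackknife variances are in hand. This is a minimal
illustration under stated assumptions: the function name and the numbers
are hypothetical, and the divisors follow the formulas as printed in steps
(4) and (5).

```python
def combine_plausible_values(estimates, sampling_vars):
    """Combine per-set results as in steps (4) and (5).

    estimates:     t_u, the statistic evaluated on each plausible-value set
    sampling_vars: Var(t_u), the jackknife variance for each set
    Returns (t_dot, total variance, sampling component, imputation component).
    """
    m = len(estimates)
    if m == 0 or m != len(sampling_vars):
        raise ValueError("need one variance per estimate")
    t_dot = sum(estimates) / m
    # First component: average jackknife (sampling) variance.
    sampling = sum(sampling_vars) / m
    # Second component: spread of the estimates across plausible-value sets.
    imputation = sum((t_u - t_dot) ** 2 for t_u in estimates) / m
    return t_dot, sampling + imputation, sampling, imputation


# Hypothetical weighted means and jackknife variances from five sets.
t_dot, total_var, sampling, imputation = combine_plausible_values(
    [250.0, 252.0, 251.0, 249.0, 253.0],
    [1.0, 1.2, 0.8, 1.1, 0.9],
)
```

In this made-up example the sampling component is 1.0 and the component
due to unobserved proficiency is 2.0, so ignoring the second component
would understate the variance by a factor of three.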

Computation in this manner of statistics t involving only writing or
reading proficiency, in conjunction with the specific background variables
y listed above, provides nearly unbiased estimates of the population values
T. Statistics involving proficiency and background variables not listed
above are subject to biases, the magnitude of which depends in part on the
relationship of the excluded background variables to the included
background variables. (See Chapters 10.3 and 11.4 for details.)


13.3.3    Multiple  Runs  with  Different  Imputes 

Estimating variability requires computing a statistic 165 times,
including 33 runs (one for the estimate itself and one for each of the 32
jackknife pseudo-replicates) to obtain an estimate and a variance estimate
from each of the five sets of plausible values. Because the cost of the
full procedure may be prohibitive in many studies, approximate procedures
may be used to produce reasonable estimates at lower costs.




One method of reducing costs is to use fewer runs on plausible value
sets. A statistic computed from a single set of plausible values has the
same expectation as the average of the five, but does not take into account
the uncertainty surrounding $\theta$ values. Use of at least two, but fewer
than five, sets of plausible values to evaluate a statistic will properly
account for this uncertainty and will reduce costs at the same time. The
occurrences of "5" in the procedure outlined above would be replaced by
"2", "3", or "4" as appropriate. The resulting decrease in computation is
accompanied by a decrease in precision for estimating Var(t.).

A  second  cost-reducing  method  produces  estimates  of  sampling 
variability  that  are  more  accurate  than  those  obtained  by  design  effects 
(see  Chapter  14.2)  but  more  variable  than  those  obtained  by  jackknifing  all 
five  pseudo-datasets.    This  method  is  to  estimate  a  statistic  on  each 
pseudo-dataset  (in  order  to  estimate  variability  due  to  the  latency  of 
proficiency)  but  compute  its  jackknife  variance  on  only  one  pseudo-dataset 
to  estimate  sampling  variability.    This  procedure  was  used  by  NAEP  to 
produce the "almanacs" of estimated effects (see Chapter 13.4 for a
discussion of almanacs).
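The saving from the second method can be made concrete by counting runs.
The sketch below is an illustration, not NAEP's program: the function
names are hypothetical, it assumes 32 jackknife pseudo-replicates per
jackknifed set, and the figure of 37 runs is an inference from the counts
given in this section rather than a number stated in the text.

```python
def runs_required(sets_estimated, sets_jackknifed, n_pairs=32):
    """Number of times the statistic must be computed.

    One run per plausible-value set for the estimate itself, plus
    n_pairs pseudo-replicate runs for each set that is jackknifed.
    """
    return sets_estimated + sets_jackknifed * n_pairs


def shortcut_variance(estimates, jackknife_var_one_set):
    """Second cost-reducing method: jackknife one set, estimate on all.

    estimates:             the statistic evaluated on each plausible-value set
    jackknife_var_one_set: jackknife variance computed on a single set
    """
    m = len(estimates)
    t_dot = sum(estimates) / m
    imputation = sum((t_u - t_dot) ** 2 for t_u in estimates) / m
    return t_dot, jackknife_var_one_set + imputation


full = runs_required(5, 5)       # full procedure
shortcut = runs_required(5, 1)   # jackknife only one pseudo-dataset
t_dot, approx_var = shortcut_variance([250.0, 252.0, 251.0, 249.0, 253.0], 1.0)
```

Under these assumptions the full procedure needs 165 runs and the shortcut
needs 37, at the price of a noisier estimate of the sampling component.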

NOTE: It is not appropriate to average the five plausible values
associated with each respondent and analyze those averages. The result of
such a computation is not generally equal to the correct value.
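A small example shows why averaging within respondents goes wrong for a
statistic that is not linear in proficiency. The sketch is hypothetical
throughout: it uses two made-up respondents, ignores sampling weights, and
takes the proportion at or above a cut point of 250 as the statistic.

```python
def proportion_at_or_above(values, cut):
    """Unweighted proportion of values at or above the cut point."""
    return sum(v >= cut for v in values) / len(values)


def per_set_estimate(pv_by_respondent, cut):
    """Evaluate the statistic on each plausible-value set, then average."""
    n_sets = len(pv_by_respondent[0])
    per_set = [
        proportion_at_or_above([pvs[u] for pvs in pv_by_respondent], cut)
        for u in range(n_sets)
    ]
    return sum(per_set) / n_sets


def averaged_pv_estimate(pv_by_respondent, cut):
    """Inappropriate shortcut: average each respondent's values first."""
    means = [sum(pvs) / len(pvs) for pvs in pv_by_respondent]
    return proportion_at_or_above(means, cut)


# Five plausible values for each of two hypothetical respondents.
pvs = [
    [230, 270, 240, 260, 250],
    [200, 220, 210, 190, 230],
]
correct = per_set_estimate(pvs, 250)
shortcut = averaged_pv_estimate(pvs, 250)
```

Here the set-by-set procedure gives 0.3, while averaging the plausible
values first gives 0.5. Averaging removes the spread that represents
uncertainty about each respondent's proficiency, so statistics that depend
on that spread are distorted.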




Chapter 13.4
USE OF THE NAEP ALMANACS


Rebecca  Zwick 
Educational  Testing  Service 


The  sets  of  tables  summarizing  NAEP  results  are  referred  to  as 
almanacs.    This  chapter  includes  (a)  an  annotated  table  listing  the  NAEP 
almanacs for Year 15 (1983-84) and for reading and writing trends, (b) a
description  of  the  format  in  which  information  is  presented  in  these 
almanacs,  and  (c)  some  cautionary  notes  concerning  the  interpretation  and 
analysis  of  weighted  means  and  percentages  in  NAEP. 


13.4.1    Table  of  NAEP  Almanacs 

Table  13.4(1)  lists  the  almanacs  for  NAEP  Year  15  and  for  the  reading 
and  writing  trends.    The  information  that  can  be  found  in  these  almanacs 
includes  item  percents-correct  and  mean  proficiency  values  for  reading  and 
writing,  as  well  as  responses  to  background  and  attitude  items.  Almanacs 
that were created for special studies, such as the BIB/pace bridge study,
are not included in this table. Note that, except where specified, these
almanacs include data for the BIB sample only (see Chapter 5). The Year 15
almanacs correspond to the three grade samples (4, 8, and 11). "Age-only"
students, that is, those who were age 9, but not in grade 4, age 13, but
not in grade 8, or age 17, but not in grade 11, were excluded. The trend
almanacs correspond to the three age samples (9, 13, and 17). "Grade-only"
students  (i.e.,  those  in  grades  4,  8,  or  11,  but  not  in  the  designated  age 
groups)  were  excluded.    (See  Chapter  4  for  a  further  description  of  age  and 
grade  samples.) 


                              Table 13.4(1)
           NAEP Year 15 Almanacs - Dates of Issue and Comments

Type of Almanac                          Grade 4      Grade 8      Grade 11

1. Background and Attitude Items         10/25/84*    6/11/85      11/16/84*
2. Reading and Writing Items             3/27/86      3/27/86      3/27/86
3. Background and Attitude Items         8/19/85      8/1/85       8/23/85
   with Reading Proficiencies
4. Reading and Writing Items with        8/19/85      8/1/85       8/23/85
   Reading Proficiencies
5. Background and Attitude Items         12/23/85     12/23/85     1/6/86
   with Writing Proficiencies

                           NAEP Trend Almanacs

Type of Almanac                          Age 9        Age 13       Age 17

6. Reading Trend: Mean Proficiencies     8/7/85       8/7/85       8/7/85
   and Percent at or above anchor
   points (Years 2, 6, 11, 15; BIB
   and pace samples merged)
7. Reading Trend: Percent correct for    7/10/85      7/10/85      7/10/85
   items common to Years 2, 6, 11, 15    13 items     22 items     19 items
   (BIB and pace separately)
8. Writing Trend: Primary trait and      2/12/86      2/12/86      4/3/86
   holistic scores for 1 item common
   to Years 5, 10, and 15 and for
   2 items common to Years 10 and 15
   (pace sample)

* Standard errors are incorrectly estimated; see Section 13.4.3.


13.4.2    Format of Information Contained in NAEP Almanacs

In the sections below, the format of information in each of the
types of NAEP almanacs listed in the table is described.


13.4.2.1    Type 1:    Background and Attitude Items

Each almanac page corresponds to a particular background or attitude
item, such as, "How far in school did your father go?" The possible
responses to the item are listed across the top of the page. Along the
left-hand side of the page is a list of socio-demographic groups, such as
male, female, white, black, or Hispanic. Each line of results in the
almanac contains five kinds of information:

(a)  the actual sample size, N, for the group to which that line
applies,

(b)  the weighted N, which is the sum of the sampling weights for
the group,

(c)  a measure of variability of the weighted N,* which is enclosed
in parentheses following the weighted N,

(d)  the weighted percentages of group members who gave the
responses listed horizontally across the page, and

(e)  the standard errors of the weighted percentages, which are
enclosed in parentheses following the percentages to which
they apply.


13.4.2.2    Type 2:    Reading and Writing Items

These  almanacs  have  the  same  format  as  the  Type  1  almanacs,  above.  The 
only  difference  is  that  the  headings  listed  across  the  top  of  the  page 
represent  responses  to  reading  and  writing  items.    In  the  case  of  reading 
items,  the  correct  response  choice  is  indicated  by  an  asterisk. 


13.4.2.3    Type 3:    Background and Attitude Items with Reading Proficiencies

These  almanacs  include  the  same  type  of  information  contained  in  the 
almanacs  of  Type  1,  as  well  as  an  additional  line  of  information  for  each 
reporting  group.    This  extra  line  shows  the  weighted  mean  reading 
proficiency  (based  on  the  first  plausible  value  only;  see  Chapter  10.3)  for 
students  who  gave  each  of  the  possible  responses  to  the  background  or 
attitude  item  on  that  page.    The  standard  error  of  the  mean  is  given  in 
parentheses  following  the  mean.    (As  outlined  in  Section  13.4.3,  the 
standard error includes a component due to sampling variability and a
component  due  to  imprecision  of  measurement.) 

Note that the sample of observations on which these almanacs are based
is not identical to that used for the Type 1 almanacs. This is because a
subset of students have complete background and attitude data, but do not
have proficiency estimates (see Chapter 10.2, on LOGIST).

* The measure of variability of the weighted N, item (c) in Section
13.4.2.1, has changed over time. For almanacs issued prior to November
1984, this measure was the standard error of the weighted N. For almanacs
issued between November 1984 and January 10, 1985, the measure given is
the coefficient of variation (C.V.), which is equal to the standard error
of the weighted N, divided by the weighted N. For almanacs issued January
10, 1985, or later, the measure given is the rescaled coefficient of
variation, C.V.* = 100 x C.V. The importance of C.V.* is described in
Section 13.4.3.

13.4.2.4    Type 4:    Reading and Writing Items with Reading Proficiencies

These almanacs are of the same format as those of Type 3, except that
the headings listed across the top of the page represent responses to
reading and writing items.

13.4.2.5    Type 5:    Background and Attitude Items with Writing Proficiencies

These  almanacs  are  of  the  same  format  as  those  of  Type  3,  except  that 
weighted  mean  writing  proficiencies  (based  on  all  five  plausible  values) 
rather  than  reading  proficiencies  are  given. 

13.4.2.6    Type 6:    Reading Trend: Mean Proficiencies

The first page of the trend tables shows weighted reading proficiency
means (based on all five plausible values) and standard errors for each of
the reporting groups listed along the left side of the page, for each of
the assessment years (2: 1970-71, 6: 1974-75, 11: 1979-80, and 15:
1983-84) listed across the top of the page. Subsequent pages give the
unweighted  number  and  weighted  percentage  of  students  in  each  cell  of  the 
first  table,  and  the  weighted  percentages  of  students  with  reading 
proficiencies  at  or  above  each  of  the  behavioral  anchoring  points  (see 
Chapter  10.5,  on  scale  definition  and  behavioral  anchoring).    The  Year  15 
data  in  this  almanac  are  based  on  the  combined  BIB  and  pace  samples. 


13.4.2.7    Type  7:    Reading  Trend:  Percents  Correct 

The first page in these almanacs has essentially the same format as the
Type 6 almanacs. Instead of a mean proficiency value, however, these
almanacs give the weighted mean percent-correct, averaged across the
reading items common to the four assessments. The subsequent pages provide
this information separately for each item. In these almanacs, Year 15 data
are provided separately for the BIB and pace samples.


13.4.2.8    Type  8:    Writing  Trend 

Each page of these almanacs corresponds to a single writing item.
Reporting variables and assessment years are listed along the left side of
the page. The possible score categories are listed along the top. Each
entry in the main body of the table gives the weighted percentage of
students in a particular reporting group and assessment year who received
the specified score; standard errors are given. Data are provided for one
item common to Years 5, 10, and 15, and two items common to Years 10 and
15. Year 15 data in the Type 8 almanacs are for the pace sample only.



13.4.3    Interpretation  and  Analysis  of  Weighted  Means 
and  Percentages  in  NAEP 

Weighted  proficiency  means  and  weighted  percentages  of  students  giving 
a  particular  response  to  a  background  or  attitude  item  are  likely  to  be 
used  in  both  descriptive  and  inferential  analyses  of  the  NAEP  data.  In 
both  cases,  the  standard  errors  of  these  statistics  should,  of  course,  be 
considered.    As  described  in  Chapter  13,  the  standard  errors  of  mean 
proficiencies  are  larger  than  those  that  would  be  obtained  from 
conventional  formulas  for  two  reasons:    First,  the  use  of  cluster  sampling 
in  NAEP  results  in  non-independent  observations.    Second,  the  proficiency 
estimation  methodology  used  in  NAEP  provides  consistent  estimates  of 
selected  group-level  characteristics,  but  does  not  yield  precise  estimates 
for  individual  respondents.    In  the  case  of  weighted  percentages  of 
students,  only  the  first  reason  applies. 

In some cases, the standard errors themselves are poorly estimated, as
reflected by large values of C.V.* = 100 x C.V. Westat sampling
statisticians suggest the following rule of thumb: If the value of C.V.*
for a particular line of an almanac is less than 10, it can be assumed that
the standard errors for that almanac line are well estimated; if C.V.* is
between 10 and 20, the adequacy of the standard error estimates is in
question; if C.V.* is greater than 20, the standard error estimates are
unacceptable. Hypothesis tests and confidence intervals should not be
computed if C.V.* exceeds 20. (In some NAEP almanacs, values of C.V.*
greater than 20 are flagged.) Also, as noted in the table, the standard
errors in the Type 1 (Background and Attitude) almanacs for Grades 4 and
11 were incorrectly estimated because a replicate weight was wrong (see
Chapter 13.2).
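The rule of thumb is easy to apply mechanically when screening almanac
lines. The sketch below is an illustration only: the function names and
labels are hypothetical, and the treatment of values exactly equal to 10
or 20 (classed here as "questionable") is an assumption, since the text
does not say how boundary values are handled.

```python
def rescaled_cv(se_weighted_n, weighted_n):
    """C.V.* = 100 x (standard error of weighted N / weighted N)."""
    return 100.0 * se_weighted_n / weighted_n


def standard_error_quality(cv_star):
    """Classify an almanac line by the Westat rule of thumb."""
    if cv_star < 10:
        return "well estimated"
    if cv_star <= 20:
        return "questionable"
    # Above 20: do not compute hypothesis tests or confidence intervals.
    return "unacceptable"


# Hypothetical almanac lines: (standard error of weighted N, weighted N).
labels = [
    standard_error_quality(rescaled_cv(se, n))
    for se, n in [(50.0, 1000.0), (150.0, 1000.0), (250.0, 1000.0)]
]
```

The three hypothetical lines have C.V.* values of 5, 15, and 25 and fall
into the three categories in turn.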

Another issue that must be considered in conducting statistical
analyses, such as comparisons of means or percentages, is the degrees of
freedom. In a complex sample, the degrees of freedom are a function of the
number of primary sampling units (PSUs) and strata, rather than the number
of observations. In NAEP, the upper bound to the number of degrees of
freedom available for an analysis is 32, the number of PSU pairs (which is
equal to the number of PSUs minus the number of strata). For a given
comparison, the number of available degrees of freedom could be less
because only a subset of all PSUs is involved. Further reductions in
degrees of freedom may result from inequalities of within-PSU variability.
Therefore, in order to avoid Type I errors, a stringent critical value
should be used in conducting hypothesis tests. This is especially true
when multiple tests are to be performed.


As a result of the cluster sampling used in NAEP, the means or
percentages for two groups of respondents are not, in general, independent
even if the groups do not contain any of the same subjects; instead, they
may be positively correlated. The effect of this dependency, which is
typically ignored, is to reduce slightly the likelihood of a statistically
significant result. However, the conservativeness introduced by this
dependency is likely to be far outweighed by the increased risk of Type I
error that can result from the performance of multiple tests and the
overestimation of the degrees of freedom.

Finally, as detailed in Chapters 10.3 and 11.4, mean reading and
writing proficiencies may be biased in certain subgroups. If the grouping
variable is one of those used in the conditioning procedures described in
the above chapters, the mean proficiencies will be virtually unbiased. If
the grouping variable was not used in the conditioning, the degree of bias
will be a function of the relation between the grouping variable and the
conditioning variables.




Chapter 14
SUPPLEMENTARY STUDIES

Albert  E.  Beaton 
Educational  Testing  Service 


In addition to the processes used to develop the parameter estimates
included in the NAEP reports, several other studies have been completed
which add to the credibility and usefulness of the NAEP data. Two of
these studies are reported here:

*      The validity of the NAEP Year 15 reading and writing
assessments: The ETS Standards for Quality and Fairness
(1983) require, among other things, that each testing program
provide evidence for the validity of the test scores as
related to their purpose. The NAEP assessment is, of course,
quite different from other ETS programs because the data are
not used for, and indeed cannot be used for, individual
decision-making. However, we have addressed the issue of
validity of the measuring instruments in this assessment and
the results are reported in Chapter 14.1.

*      The design effects of the NAEP data: Because the NAEP
sampling design was complex and used various natural
clusterings of students, the usual formulas used by standard
statistical systems for estimating error variances are
strictly inappropriate. A design effect is an estimate of the
proportionate increase in error introduced by using standard
formulas instead of the jackknife or other appropriate
formula. We have examined the design effects to help advise
potential secondary users of NAEP data about the likely error
in using simpler methods and to offer a computationally simple
way of approximating the proper error estimates. The results
of this study are reported in Chapter 14.2.
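As a rough illustration of how a design effect can serve as a
computationally simple correction, the sketch below treats the design
effect as the ratio of the jackknife variance to the variance from a
conventional formula. This reading, the function names, and the numbers
are assumptions made for illustration; the definitions and values actually
used are those given in Chapter 14.2.

```python
import math


def design_effect(jackknife_var, conventional_var):
    """Ratio of the design-based variance to the conventional variance."""
    return jackknife_var / conventional_var


def adjusted_standard_error(conventional_se, deff):
    """Inflate a conventional standard error by the square root of deff."""
    return conventional_se * math.sqrt(deff)


# Hypothetical values: jackknife variance 4.0, conventional variance 1.0.
deff = design_effect(4.0, 1.0)
se = adjusted_standard_error(0.5, deff)
```

With a design effect of 4, a conventional standard error of 0.5 would be
doubled to 1.0 before being used in a test or confidence interval.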




Chapter 14.1

VALIDITY ISSUES IN NAEP:
YEAR 15 READING AND WRITING ASSESSMENTS¹

Rebecca  Zwick 
Educational  Testing  Service 


In evaluating the adequacy of a cognitive or psychological instrument,
the most fundamental question is whether it can provide the basis for valid
inferences about the respondent characteristics it claims to measure. The
topic of test validity is featured prominently in the recently revised
Standards for Educational and Psychological Testing (1985), produced by the
American Educational Research Association, the American Psychological
Association, and the National Council on Measurement in Education (referred
to here as the Joint Standards), and in the ETS Standards for Quality and
Fairness (1983), which serves as a basis for periodic reviews of ETS
programs.

Like most published work on validity, the discussions in these
standards manuals focus on testing situations in which the goal is to draw
conclusions about individual respondents. Validity must be conceptualized
differently in the case of large-scale assessments such as NAEP. The goal
of NAEP is to make inferences about groups, rather than individuals. In
fact, inferences about individuals are rendered impossible because ETS does
not retain the identity of respondents to NAEP items. In addition, the
proficiency estimation procedures used in NAEP result in values that are
appropriate for calculating statistics based on certain groups of students,
but are not optimal estimates of reading and writing proficiency for
individuals.

Furthermore, the kinds of evidence that can be provided in support of
validity differ from the information that is typically available for test
validation. It is not generally possible, within the context of NAEP, to
collect additional data from NAEP respondents or to conduct supplementary
research studies for purposes of investigating validity questions. Also,
because precise estimates of individual proficiency are not available in
NAEP, estimates of correlations involving NAEP reading and writing skills
cannot be obtained in a straightforward manner (see below). This further
limits the validation process, which typically involves examination of
the correlations between the measure of interest and other variables.


¹ The correlational analyses for this section were performed by Tom
Jirele.


These limitations do not, of course, exempt NAEP from the obligation of
considering validity issues. It is important to recognize, however, that
it is not always desirable, or, indeed, possible, to apply conventional
test validation procedures in NAEP.

The  ETS  Standards  manual  lists  three  components  of  validity:  content, 
construct,  and  criterion-related  validity.    In  the  Joint  Standards  manual, 
these  three  components  are  described  instead  as  three  types  of  validity 
evidence. This reflects the view of Messick, who pointed out that
"different  kinds  of  inferences  from  test  scores  require  different  kinds  of 
evidence,  not  different  kinds  of  validity"  (1980,  p.  1014). 

According to the Joint Standards, content-related evidence demonstrates
that the sample of items on a test is representative of a specified
content domain. Because NAEP is not a test, but a survey that uses
multiple matrix sampling, the items received by a given student cannot be
expected to be a representative sample of the subject matter domain (e.g.,
reading  or  writing).    However,  it  is  possible  to  evaluate  the 
representativeness  of  the  total  item  pool  corresponding  to  each  domain  by 
applying  essentially  the  same  techniques  used  in  conventional  content 
validation. 

Construct-related evidence supports the use of the test score as a
measure of the characteristic of interest. This concept must be revised
slightly in the case of achievement surveys: In NAEP, the goal of
construct validity studies is to determine whether the mean reading and
writing scale values for selected sociodemographic groups can be
interpreted as measures of reading and writing proficiency.
(Alternatively, the validity of inferences based on mean item scores, i.e.,
the proportions of selected groups who answered individual items correctly,
could be considered. Item-level analyses can be useful for research that
focuses on specific skills; see Writing: Trends Across the Decade,
1974-84 [Applebee, Langer, & Mullis, 1986a]. The present section is
primarily concerned with inferences about overall reading and writing
proficiency and focuses, therefore, on the validity of conclusions based on
item composites.)

Criterion-related  evidence  shows  that  test  scores  are  related  to 
pertinent  outcome  criteria.    For  instance,  in  validating  a  measure  used  in 
employee  selection,  it  is  important  to  demonstrate  that  test  scores  are 
related  to  job  performance.  Because  NAEP  is  not  a  test  and  is  not  used  to 
make  inferences  about  individuals,  no  comparable  evidence  regarding  the 
NAEP  assessment  instruments  can  be  provided. 

In  the  following  sections,  content  and  construct  validity  issues  are 
discussed  with  reference  to  the  NAEP  reading  and  writing  assessment. 


14.1.1    Content  Validity 

The  NAEP  reading  and  writing  exercises  for  1983-1984  were  the  product 
of an elaborate process of item development, described in detail in Chapter
3 and in two NAEP booklets, Reading Objectives: 1983-1984 Assessment (1984)
and Writing Objectives: 1983-1984 Assessment (1982). The process consisted
of  four  phases:    (1)  development  and  review  of  educational  objectives  in 
reading  and  writing,  (2)  development  and  review  of  a  pool  of  items 
corresponding  to  these  objectives,  (3)  field  testing  of  prospective  items, 
and  (4)  final  review  and  item  selection.    External  consultants  were 
extensively  involved  in  developing  and  reviewing  objectives  and  in  writing 
and  reviewing  prospective  items.    In  the  case  of  the  writing  assessment, 
consultants  also  participated  in  the  development  and  review  of  procedures 
for  scoring  the  item  responses.  These  consultants  included  subject  area 
experts,  curriculum  specialists,  classroom  teachers,  school  administrators, 
and parents. Contributors were chosen to represent a diversity of ethnic
groups,  community  types,  and  geographic  regions.    The  participation  of 
these  consultants  was  achieved  through  a  series  of  conferences  and  mail 
reviews,  coordinated  by  the  NAEP  advisory  committees  for  reading  and 
writing. 

The  items  selected  in  the  final  phase  of  the  process  were  required  to 
reflect  the  educational  objectives  agreed  upon  in  the  first  phase  and  to  be 
consistent  with  established  principles  of  test  development.    Among  other 
criteria,  they  had  to  be  judged  free  from  apparent  bias  against  any 
sociodemographic  group. 

In  short,  a  great  deal  of  effort  was  expended  to  define  reading  and 
writing  domains  and  to  ensure  that  the  final  pool  of  NAEP  items  represented 
these  domains  adequately.    Although  all  these  NAEP  exercises  are  available 
for  item-level  analyses,  some  items  were  excluded  from  the  reading  and 
writing  proficiency  scales  because  of  practical  considerations  or,  in  the 
case  of  reading  items,  because  they  were  expected  to  produce  violations  of 
unidimensionality  assumptions.    Details  on  the  criteria  used  for  including 
items  in  the  proficiency  scales  are  given  in  Chapters  10.2  and  11.4.  The 
reading  scale  included  228  of  the  340  items  originally  designated  as 
reading  items;  the  writing  scale  included  10  of  22  writing  items.  The 
development  of  proficiency  scales  facilitated  the  summarization  of  reading 
and  writing  results  and  the  comparison  of  results  across  grades  and  across 
assessment  years.    A  drawback  of  the  scaling  is  that  the  included  items 
cannot  be  considered  fully  representative  of  the  domains  that  were 
initially  defined. 


14.1.2    Construct  Validity 


14.1.2.1    NAEP  Reading  and  Writing  Proficiency  Scales 

In  previous  assessments,  NAEP  results  consisted  of  only  the  responses 
to  individual  items.    In  contrast,  analysis  of  the  1983-1984  data  included 
the  development  of  reading  and  writing  proficiency  scales.    In  this 
chapter,  the  properties  of  the  NAEP  reading  and  writing  scales  that  are 
relevant  to  validity  assessment  are  considered.    Development  of  the  scales 
is  described  in  detail  in  Chapters  10  and  11. 




The  NAEP  reading  data,  which  consisted  of  dichotomous  item  responses, 
were  scaled  using  item  response  theory  (IRT)  methods.    Specifically,  the 
three-parameter  logistic  model  (Birnbaum,  1968)  was  applied.    Because  many 
students  had  received  only  a  small  number  of  items,  precise  estimates  of 
each  respondent's  proficiency  could  not  be  obtained.    Instead,  estimation 
procedures  that  would  produce  consistent  estimates  of  selected  group-level 
characteristics  were  used.    For  each  student,  a  proficiency  distribution 
was  estimated,  conditional  on  that  individual's  item  responses  and  on 
selected  demographic  characteristics.    Theoretically,  these  estimated 
distributions  could  be  used  to  compute  statistics  of  interest,  such  as 
subgroup  means,  via  integration.    Because  evaluation  of  the  required 
integrals  presents  computational  problems,  the  statistics  of  interest  can 
instead  be  estimated  by  making  use  of  "plausible  values"  selected  at  random 
from  each  respondent's  distribution.    For  each  respondent,  five  plausible 
values  were  drawn,  each  of  which  can  be  viewed  as  an  estimate  of  that 
student's  unknown  proficiency  value.    This  proficiency  estimation 
methodology  serves  the  goals  of  NAEP  in  that,  unlike  conventional  IRT 
methods,  it  provides  consistent  estimates  of  group  characteristics  when 
applied  to  the  sparse  data  available  for  individual  NAEP  respondents.  A 
drawback  is  that  the  plausible  values  are  not  optimal  estimates  of 
individual  proficiency. 
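The plausible-values idea described above can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not the NAEP estimation procedure: it assumes that the three-parameter logistic item parameters (the hypothetical arrays `a`, `b`, and `c`) are known, that the conditioning model is a normal prior whose mean depends on a single group indicator and is known exactly, and that each respondent's proficiency distribution can be approximated on a grid. All function names are invented for the example.

```python
import numpy as np

def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

def draw_plausible_values(resp, a, b, c, prior_mean, prior_sd, rng, n_draws=5):
    """Draw plausible values from a grid approximation to one respondent's posterior.

    resp holds 0/1 for administered items and np.nan for items not presented.
    """
    grid = np.linspace(-4.0, 4.0, 161)
    seen = ~np.isnan(resp)
    p = p_correct(grid[:, None], a[seen], b[seen], c[seen])   # grid points x items
    loglik = (resp[seen] * np.log(p) + (1 - resp[seen]) * np.log(1 - p)).sum(axis=1)
    logpost = loglik - 0.5 * ((grid - prior_mean) / prior_sd) ** 2
    post = np.exp(logpost - logpost.max())
    post /= post.sum()
    return rng.choice(grid, size=n_draws, p=post)

def simulate_group_means(n_students=2000, n_items=40, items_per_student=8, seed=0):
    """Simulate sparse item data and estimate two subgroup means from plausible values."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.8, 1.6, n_items)          # assumed discriminations
    b = rng.normal(0.0, 1.0, n_items)           # assumed difficulties
    c = np.full(n_items, 0.2)                   # assumed lower asymptotes
    group = rng.integers(0, 2, n_students)
    group_mean = np.array([-0.3, 0.3])          # assumed true subgroup means
    theta = rng.normal(group_mean[group], 1.0)
    pv = np.empty((n_students, 5))
    for i in range(n_students):
        resp = np.full(n_items, np.nan)
        items = rng.choice(n_items, items_per_student, replace=False)
        resp[items] = rng.random(items_per_student) < p_correct(
            theta[i], a[items], b[items], c[items])
        pv[i] = draw_plausible_values(resp, a, b, c, group_mean[group[i]], 1.0, rng)
    # Compute the group statistic once per set of plausible values, then average.
    per_set = [[pv[group == g, k].mean() for k in range(5)] for g in (0, 1)]
    return np.array(per_set).mean(axis=1), group_mean
```

Calling `simulate_group_means()` returns the estimated and the assumed true subgroup means. With only eight items per simulated student, any single plausible value is a noisy indication of that student's proficiency, yet the subgroup means should be recovered closely, which is the property the text relies on.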

Development of the writing proficiency scale presented an additional
challenge: unlike the reading items, which were dichotomously scored, the
writing  items  were  scored  on  a  five-point  scale.    (Only  the  primary  trait 
scores  were  used  for  the  proficiency  scale;  see  Chapter  13.4.)  Although 
generalizations  of  IRT  methods  have  been  developed  for  rating  scale  data, 
application  of  these  scaling  techniques  to  the  NAEP  writing  data  did  not 
produce  satisfactory  results.    Therefore,  a  multiple  regression  approach, 
called  the  average  response  method  (ARM),  was  used  to  develop  the  writing 
proficiency  scale. 

Although  students  had  answered  different  subsets  of  the  ten  NAEP 
writing  items  selected  for  inclusion  in  the  scale,  it  was  possible,  because 
of  BIB  spiralling  (see  Chapter  4),  to  obtain  the  matrix  of  pairwise 
correlations  between  the  items.  Therefore,  for  each  respondent,  a  predicted 
mean  score  on  the  ten  items  could  be  derived,  conditional  on  that 
individual's  item  responses  and  on  selected  demographic  characteristics.  A 
writing plausible value, analogous to the reading plausible value, was then
obtained by adding to this predicted score a random term representing the
uncertainty  of  the  respondent's  predicted  mean,  given  that  individual's 
demographic  characteristics  and  item  responses. 
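The logic of a predicted mean score plus a random term can be sketched as follows. This is a simplified illustration and not the ARM procedure itself: it assumes that the item scores are jointly normal with a known mean vector `mu` and covariance matrix `sigma` (both hypothetical inputs, standing in for quantities estimated from the pairwise item data), and it conditions only on the observed item responses, omitting the demographic conditioning variables.

```python
import numpy as np

def predicted_mean_score(scores, mu, sigma):
    """Conditional mean and variance of the all-item average, given the observed items.

    scores: length-k array with np.nan for items the student did not receive.
    mu, sigma: assumed population mean vector and covariance matrix of the k items.
    """
    obs = ~np.isnan(scores)
    mis = ~obs
    k = len(scores)
    w = np.full(k, 1.0 / k)                      # weights that define the mean score
    s_oo = sigma[np.ix_(obs, obs)]
    s_mo = sigma[np.ix_(mis, obs)]
    s_mm = sigma[np.ix_(mis, mis)]
    resid = scores[obs] - mu[obs]
    # Regression prediction of the unobserved item scores from the observed ones.
    cond_mu_mis = mu[mis] + s_mo @ np.linalg.solve(s_oo, resid)
    cond_cov_mis = s_mm - s_mo @ np.linalg.solve(s_oo, s_mo.T)
    mean = w[obs] @ scores[obs] + w[mis] @ cond_mu_mis
    var = w[mis] @ cond_cov_mis @ w[mis]
    return mean, max(var, 0.0)

def writing_plausible_value(scores, mu, sigma, rng):
    """Predicted mean score plus a random term reflecting the prediction uncertainty."""
    mean, var = predicted_mean_score(scores, mu, sigma)
    return mean + rng.normal(0.0, np.sqrt(var))
```

For a student who answered two of four hypothetical items, `predicted_mean_score` fills in the two missing items by regression and reports the variance of the resulting mean; `writing_plausible_value` then adds a normal deviate with that variance.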

Although  the  methods  of  scale  development  were  not  identical  for 
reading  and  writing,  both  methods  yield  so-called  plausible  values,  which 
are  not  optimal  estimates  of  individual  proficiency.  The  scale  construction 
methodology  has  important  implications  for  the  assessment  of  validity  in 
NAEP.    For  example,  as  a  result  of  the  scaling  approach,  sample  means  for 
certain NAEP subpopulations are biased. Also, correlation coefficients
based on plausible values yield seriously attenuated estimates of the
relations between NAEP reading proficiency, writing proficiency, and other
variables of interest. These issues are addressed in the following
sections.


14.1.2.2    Validity Evidence Based on Group Differences in Mean Proficiency

NAEP has focused its energies on the calculation of means and standard
errors for subpopulations that are of primary interest. These selected
groups are based on the following demographic variables: grade (4, 8, or
11), sex (male or female), ethnicity (white, black, Hispanic, or other),
size and type of community (advantaged urban, disadvantaged urban, or
rural), region (northeast, southeast, central, or west), and parental
education (did not graduate from high school, graduated from high school,
some education beyond high school, or unknown). (See Chapter 12 for
detailed definitions of these variables.) These demographic
variables were used as conditioning variables for developing both the
reading and writing scales. (An additional variable, grade/age status, was
[illegible].) Because of their
inclusion in the conditioning, estimates of proficiency means for
subpopulations based on these variables are virtually unbiased. The means
for other demographic groups are biased to some degree, for reasons
explained in Section 10.3.5, and cannot, therefore, be used to draw
strictly valid inferences. (It should be noted that [illegible]
inferences based on item-level data.)

In evaluating the construct validity of NAEP, it is important to
determine whether the patterns of proficiency means for the primary
demographic groups are consistent with educational theory. As a rather
obvious example, mean proficiency in Grade 11 is expected to be highest,
followed by Grades 8 and 4 in that order. Of course, confirmation of this
expectation is not sufficient to demonstrate validity; disconfirmation,
however, would cast serious doubt on the results. In the following
section, details are provided on grade differences and other group
differences for which theory-based hypotheses could be formed. The mean
proficiency values on which this section is based, along with their
standard errors, are given in Table 14.1(1). (The sources of these data
are The Reading Report Card: Progress Toward Excellence in Our Schools
[illegible] and The Writing Report Card: Writing Achievement in American
Schools, 1984 [Applebee, Langer, & Mullis, 1986b].)

Grade. For both reading and writing, the proficiency means
for the three grades are appropriately ordered. In both subject
areas,  the  difference  between  Grades  8  and  11  is  smaller  than  the 
difference  between  Grades  4  and  8.     This  is  consistent  with 
expectations  for  two  reasons:    First  of  all,  the  Grade  8  and  11 
students  are  only  three  grades  apart,  whereas  the  Grade  4  and  8 
students  are  four  grades  apart.    Also,  theories  of  cognitive 
development  predict  greater  improvement  in  reading  and  writing 
proficiency  between  Grades  4  and  8  than  in  the  teenage  years. 
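The first of the two reasons can be checked with simple arithmetic on the total-group means reported in Table 14.1(1). The short sketch below divides each between-grade difference by the number of grades separating the two groups; the function name and the dictionary layout are illustrative choices, not part of the NAEP analysis.

```python
# Total-group means from Table 14.1(1): (reading, writing) by grade.
means = {4: (217.5, 1.58), 8: (260.7, 2.05), 11: (289.3, 2.19)}

def gain_per_grade(lower, upper, subject):
    """Mean difference between two grades, divided by the number of grades apart."""
    idx = 0 if subject == "reading" else 1
    return (means[upper][idx] - means[lower][idx]) / (upper - lower)

for subject in ("reading", "writing"):
    early = gain_per_grade(4, 8, subject)
    late = gain_per_grade(8, 11, subject)
    print(f"{subject}: {early:.2f} per grade (4 to 8), {late:.2f} per grade (8 to 11)")
```

On these figures the gain per grade remains larger between Grades 4 and 8 than between Grades 8 and 11 (roughly 10.8 versus 9.5 scale points in reading), so the unequal spacing of the grades does not by itself account for the difference, in line with the second reason given above.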




Table 14.1(1)

Reading and Writing (ARM) Proficiency Means for Selected Groups
(standard errors in parentheses)

                                 Grade 4                   Grade 8                   Grade 11
                           Reading      Writing      Reading      Writing      Reading      Writing

Total                     217.5( .7)   1.58(.01)    260.7( .5)   2.05(.01)    289.3( .8)   2.19(.01)

Sex
  Male                    215.1( .9)   1.50(.01)    257.0( .6)   1.96(.01)    284.5(1.0)   2.09(.01)
  Female                  220.0( .7)   1.66(.01)    264.3( .6)   2.14(.01)    294.3( .9)   2.29(.01)

Parental Education
  Did not graduate H.S.   200.2(1.2)   1.43(.03)    244.2( .7)   1.89(.02)    269.5(1.2)   1.99(.02)
  Graduated H.S.          215.5( .8)   1.54(.01)    255.5( .7)   2.02(.01)    281.8( .7)   2.15(.01)
  Post H.S.               227.4(1.1)   1.66(.01)    271.8( .7)   2.13(.01)    300.6( .9)   2.27(.01)

Size/Type of Community
  Rural                   207.7(2.5)   1.53(.02)    259.9(2.3)   2.03(.03)    284.6(3.2)   2.13(.03)
  Disadvantaged urban     198.5(1.5)   1.42(.02)    241.6(1.9)   1.88(.02)    267.8(2.5)   2.01(.02)
  Advantaged urban        234.5(2.3)   1.70(.02)    276.9(2.6)   2.21(.02)    300.2(3.0)   2.28(.02)


Sex. Research in cognitive development has consistently
demonstrated that the verbal skills of girls are superior to those
of boys. This held true for the NAEP results in both reading and
writing. The magnitude of the superiority increased slightly with
grade  level,  which  is  consistent  with  the  findings  of  some 
prominent  researchers  (see  Jacklin,  1979;  Mussen,  Conger,  &  Kagan, 
1969). 
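The statement that the female advantage grows slightly with grade level can be verified directly from the means in Table 14.1(1). The sketch below computes the female-minus-male difference at each grade; the helper name is an illustrative choice, and the Grade 8 male reading mean is read as 257.0.

```python
# (male, female) means from Table 14.1(1), by grade.
reading = {4: (215.1, 220.0), 8: (257.0, 264.3), 11: (284.5, 294.3)}
writing = {4: (1.50, 1.66), 8: (1.96, 2.14), 11: (2.09, 2.29)}

def female_advantage(table):
    """Female mean minus male mean for each grade, in ascending grade order."""
    return [round(table[g][1] - table[g][0], 2) for g in sorted(table)]

reading_gap = female_advantage(reading)
writing_gap = female_advantage(writing)
```

The differences rise from Grade 4 to Grade 11 in both subject areas. Because the standard errors of these differences are not computed here, the sketch describes the direction of the pattern only and is not a significance test.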

Parental  Education.    Previous  research  has  consistently  shown 
that  student  achievement  tends  to  be  highest  for  those  whose 
parents  have  received  the  most  education  (e.g.,  Jones,  Burton,  & 
Davenport,  1982).    The  NAEP  results  provided  an  additional 
confirmation  of  this  for  both  reading  and  writing,  within  each 
grade.    It  should  be  noted  that  the  NAEP  data  on  parent  education 
are based on students' reports. The category definitions for
parental education (given in abbreviated form in Table 14.1(1))
are: (1) neither parent graduated from high school, (2) at least
one  parent  graduated  from  high  school  (but  neither  parent  received 
post-high  school  education),  and  (3)  at  least  one  parent  received 
some  post-high  school  education. 

Size  and  Type  of  Community.    NAEP  reading  and  writing  results 
were  reported  for  three  community  types  of  special  interest, 
defined  as  follows: 

Rural  communities:    Students  in  this  group  attend 
schools  in  areas  with  a  population  under  10,000  where 
many  of  the  residents  are  farmers  or  farm  workers. 

Disadvantaged  urban  communities:    Students  in  this 
group  attend  schools  in  or  around  cities  having  a 
population  greater  than  200,000  where  a  high  proportion 
of  the  residents  are  on  welfare  or  are  not  regularly 
employed. 

Advantaged  urban:    Students  in  this  group  attend 
schools  in  or  around  cities  having  a  population  greater 
than  200,000  where  a  high  proportion  of  the  residents 
are  in  professional  or  managerial  positions. 

(Note  that  only  about  a  third  of  the  NAEP  respondents  lived 
in  communities  that  fell  into  one  of  these  categories.)    As  would 
be  expected,  based  on  these  definitions,  achievement  was  higher 
for  advantaged  urban  students  than  for  disadvantaged  urban 
students  in  all  three  grades.    Expectations  for  rural  students 
were less clear. In fact, their achievement levels were
consistently above those of the disadvantaged urban group but below
those of the advantaged urban students.




14.1.2.3    Validity Evidence Based on Correlations with Attitude Variables
and  PSAT  Scores 


Construct  validity  investigations  typically  include  examination  of  the 
correlations  between  test  scores  and  other  variables  of  interest  to 
determine  whether  they  are  consistent  with  a  hypothesized  pattern.  For 
instance, scores on a reading test would be expected to correlate more
highly  with  other  reading  measures  than  with  scores  on  a  math  test.  In 
NAEP,  however,  conventional  reading  and  writing  scores  are  not  available. 
If  the  plausible  values  are  used  in  computing  correlation  coefficients,  the 
correlation  estimates  will  be  severely  attenuated  because  the  plausible 
values  do  not  represent  precise  estimates  of  proficiency  for  individual 
respondents.    Furthermore,  there  is  no  straightforward  way  to  achieve  a 
satisfactory correction for attenuation. (In principle, plausible values
could  have  been  constructed  to  take  into  account  the  joint  marginal 
characteristics  of  reading  and  writing.    This  would  have  prevented  the 
attenuation  problem.    However,  practical  considerations  necessitated  that 
plausible  values  be  constructed  separately  for  reading  and  writing.) 
Therefore,  an  alternative  approach,  described  below,  was  used  to  estimate 
correlations  between  reading,  writing,  and  other  variables  of  interest. 

As  an  illustration  of  the  method,  consider  the  correlation  between 
reading and writing skills, as assessed in NAEP. If all N students had
answered each of the r reading items and each of the w writing items, the
data could be represented as an N x (r + w) matrix, X. It would be possible
to  obtain  a  total  score  on  reading  (i.e.,  the  number  of  reading  items 
answered  correctly)  and  a  total  score  on  writing.    The  crossproducts  matrix 
corresponding  to  the  total  score  for  reading  and  the  total  score  for 
writing  could  be  computed  in  one  of  two  ways: 

(1)  First  compute  Y  =  XT,  where  T  is  a  (r  +  w)  x  2  transformation 
matrix  that  sums  the  reading  and  writing  item  scores.  The 
first  column  of  T  has  r  ones,  followed  by  w  zeroes.  The 
second  column  has  r  zeroes,  followed  by  w  ones.    The  N  x  2 
matrix  Y  contains  the  reading  and  writing  total  scores  for 
each of the respondents. Then compute C = Y'Y, the 2 x 2
crossproducts  matrix  of  the  total  scores  on  reading  and 
writing. 

(2)  Alternatively,  start  by  computing  A  =  X'X,  the  (r  +  w)- 
dimensional  crossproducts  matrix  of  the  reading  and  writing 
item  scores.    Then  obtain  C  by  computing  the  matrix  product 
T'AT. 


In  either  of  these  methods,  the  matrix  of  correlations  for  the  reading 
and  writing  total  scores  could  be  obtained  through  a  transformation  of  C. 
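The equivalence of the two methods follows from C = Y'Y = (XT)'(XT) = T'(X'X)T = T'AT, and it can be confirmed numerically. The sketch below uses simulated complete data with arbitrary dimensions; the helper `corr_from_crossproducts` is an illustrative name, and it shows one possible form of the transformation of C, which requires the total-score sums in addition to the raw crossproducts.

```python
import numpy as np

rng = np.random.default_rng(0)
N, r, w = 200, 6, 3
X = np.hstack([rng.integers(0, 2, (N, r)),                  # dichotomous reading items
               rng.integers(1, 6, (N, w))]).astype(float)   # writing ratings, 1 to 5

# T sums the first r columns into a reading total and the last w into a writing total.
T = np.zeros((r + w, 2))
T[:r, 0] = 1.0
T[r:, 1] = 1.0

# Method 1: form the total scores first, then their crossproducts.
Y = X @ T
C1 = Y.T @ Y

# Method 2: form the item crossproducts first, then transform them.
A = X.T @ X
C2 = T.T @ A @ T

def corr_from_crossproducts(C, sums, n):
    """Turn raw crossproducts into correlations, given column sums and sample size."""
    cov = C / n - np.outer(sums, sums) / n**2
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

R = corr_from_crossproducts(C2, Y.sum(axis=0), N)
```

Here `C1` and `C2` agree to rounding error, and `R` matches the correlation matrix computed directly from the total scores. Because the simulated items are independent, the off-diagonal correlation itself is near zero and has no substantive meaning.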

Because  the  complete  data  matrix  X  is  not  available  in  NAEP,  method  1 
cannot  be  applied.    However,  an  approach  similar  to  method  2  can  be  used. 
Although A = X'X cannot be computed, BIB spiralling allows the computation
of the matrix A* of pairwise crossproducts of the r + w items. By
substituting A* for A, the approach in method 2 can be applied. This is
the approach used in the analyses described below. A subroutine for
obtaining  correlation  matrices  from  incomplete  data  was  first  applied, 
followed  by  the  Transform  Cross  Products  Matrix  (TCM)  algorithm  developed 
by  Beaton  (1964).    Because  most  of  the  missing  data  in  NAEP  are  a  result  of 
the  BIB  design  and  can  therefore  be  treated  as  missing  at  random,  the 
resulting  correlations  can  be  interpreted  in  essentially  the  same  manner  as 
correlations  between  total  test  scores  in  the  complete  data  case.  (Because 
conventional  reliability  formulas  do  not  apply  in  the  present  context, 
attempts to correct these TCM correlations for attenuation may produce
misleading results. Therefore, the reported correlations have not been
corrected for attenuation. The degree of attenuation of correlations
computed using the TCM approach, however, is much less severe than the
attenuation that results when correlations are computed using the plausible
values.)
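The substitution of pairwise quantities for the complete-data crossproducts can be sketched as follows. This is not the TCM algorithm of Beaton (1964) or the subroutine used in the NAEP analyses; it is a minimal illustration that assumes missing responses are coded as NaN, that every pair of items was administered together to at least some respondents (as a BIB design ensures), and that the data are missing at random. The function names are hypothetical.

```python
import numpy as np

def pairwise_covariance(X):
    """Covariance matrix in which each entry uses only rows where both items are present."""
    k = X.shape[1]
    S = np.empty((k, k))
    for i in range(k):
        for j in range(i, k):
            both = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            xi, xj = X[both, i], X[both, j]
            S[i, j] = S[j, i] = np.mean((xi - xi.mean()) * (xj - xj.mean()))
    return S

def total_score_correlation(X, r):
    """Correlation between the sum of the first r items and the sum of the remaining items."""
    k = X.shape[1]
    T = np.zeros((k, 2))
    T[:r, 0] = 1.0
    T[r:, 1] = 1.0
    C = T.T @ pairwise_covariance(X) @ T      # covariance matrix of the two totals
    return C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])
```

With complete data, `total_score_correlation` reproduces the ordinary correlation between the two total scores. With randomly missing responses it estimates the same quantity even though no respondent has a complete total score, which is the situation described in the text.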

All reading and writing items used in the NAEP proficiency scales were
included  in  the  correlational  analyses,  provided  that  they  had  been 
BIB-spiralled  with  all  other  items.    For  reading  items,  this  criterion 
resulted  in  the  use  of  108  out  of  118  calibrated  items  for  Grade  4,  106  out 
of  124  for  Grade  8,  and  95  out  of  113  for  Grade  11.    In  the  case  of  writing 
items,  all  items  used  in  the  ARM  scaling  were  included:    8  items  for  Grade 
4,  10  for  Grade  8,  and  6  for  Grade  11.    Thus,  although  the  resulting 
correlations  do  not  directly  reflect  the  properties  of  the  reading  and 
writing  scales,  they  are  based  on  nearly  the  same  items. 

Only  a  limited  number  of  variables  were  available  for  correlational 
analysis  of  the  NAEP  data.    Of  primary  interest  was  the  correlation  between 
reading and writing. This correlation was expected to be moderately high.
In  addition,  16  background  and  attitude  items  were  selected  for  inclusion 
in  the  correlational  analysis.    These  items,  which  were  administered  using 
a  multiple  choice  format,  included  questions  about  the  language  spoken  in 
the  student's  home,  the  grades  received  by  the  student,  the  student's 
perceptions  of  his  or  her  ability  in  reading  and  writing,  and  the  amount  of 
reading  and  writing  done  by  the  student,  both  in  and  out  of  school.  Scores 
on these attitude variables were not summed; instead, the correlation of
each item with reading and writing was considered separately. For
purposes  of  these  analyses,  responses  to  the  background  and  attitude  items 
were  re-coded  in  such  a  way  that  their  correlations  with  reading  and 
writing  were  expected  to  be  positive.    For  example,  on  items  that  asked 
students  how  often  they  read  stories  or  novels  in  their  free  time, 
responses of "almost every day" received the highest numerical code;
responses  of  "never  or  hardly  ever"  received  the  lowest.  These  correlations 
were  expected  to  be  of  low  to  moderate  size. 
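The re-coding step amounts to reversing the response codes where necessary so that a higher code always indicates more of the activity. A minimal sketch follows; the original coding shown is hypothetical, since the codes used on the assessment booklets before re-coding are not given here.

```python
def reverse_code(response, n_categories):
    """Flip a 1..n_categories code so the highest code marks the most frequent activity."""
    if not 1 <= response <= n_categories:
        raise ValueError("response outside the valid range")
    return n_categories + 1 - response

# Hypothetical original coding: 1 = "almost every day" ... 5 = "never or hardly ever".
original = [1, 5, 3, 2]
recoded = [reverse_code(x, 5) for x in original]
```

After re-coding, a response of "almost every day" carries the code 5 and "never or hardly ever" carries the code 1, matching the codes listed in Table 14.1(6).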

Finally, for a subset of Grade 11 students, verbal and quantitative
PSAT scores were available. These were obtained without violating
confidentiality requirements. Lists of PSAT scores for all PSAT takers
within a school were provided by ETS to schools participating in NAEP. If
students selected for the NAEP sample had taken the PSAT, school personnel
entered  the  scores  onto  the  students'  NAEP  records.    ETS  did  not  retain  any 
information  about  the  identity  of  the  NAEP  participants. 


533 


550 


Ideal  evidence  for  construct  validity  would  be  a  finding  that 
correlations  of  reading  and  writing  with  PSAT  verbal  scores  were  quite 
high,  exceeding  the  correlation  between  PSAT  verbal  and  quantitative 
scores,  whereas  correlations  of  reading  and  writing  with  the  PSAT 
quantitative  scores  were  only  moderate.  Results  of  this  kind  could  be 
considered  as  informal  evidence  of  both  convergent  and  discriminant 
validity  (Campbell  &  Fiske,  1959).    Convergent  validation  shows  that  the 
measure  of  interest  is  highly  correlated  with  independent  measures  of 
similar  constructs.    Discriminant  validation  demonstrates  that  the  measure 
under  evaluation  is  not  highly  correlated  with  variables  from  which  it  is 
theoretically  expected  to  differ. 


14.1.2.4    Correlations of Reading, Writing, PSAT Scores, and Selected
Background  Variables 

Separate  correlational  analyses  were  conducted  for  each  grade  and  for 
the  subsample  of  eleventh  graders  who  took  the  PSAT.    These  analyses  are 
reported in Tables 14.1(2), (3), and (5). Table 14.1(2) shows, for each
grade,  the  correlation  matrix  for  reading,  writing,  and  four  of  the  sixteen 
background  and  attitude  variables  included  in  the  analysis:  language 
spoken  in  the  home,  grades  in  school,  and  student  self-assessments  of 
ability  in  reading  and  writing,  respectively.    A  correlation  matrix  for 
these six variables, as well as the PSAT verbal and quantitative scores, is
given  in  Table  14.1(3)  for  the  PSAT  subsample.    The  precise  definitions  of 
all  variables  used  in  these  analyses  are  given  in  Table  14.1(4).  Table 
14.1(5)  gives  the  correlations  of  reading  and  writing  with  the  remaining 
twelve  background  and  attitude  items,  which  concern  frequency  of  reading 
and  writing  activities,  for  Grades  4,  8,  and  11,  and  for  the  PSAT 
subsample.    The  item  texts  and  response  codes  for  these  twelve  items  are 
given  in  Table  14.1(6). 

The correlational analyses were based on approximately 20,000
unweighted observations in each of Grades 4, 8, and 11, and 8,500 observations
in  the  PSAT  subsample.    Because  of  BIB  spiralling,  however,  the  number  of 
respondents  available  for  the  estimation  of  each  correlation  in  the 
original  matrix  was  much  less.    For  Grades  4,  8,  and  11,  most  correlations 
were  based  on  200  to  300  observations;  for  the  PSAT  subsample,  most 
correlations  were  based  on  about  100  observations. 

The correlation between reading and writing, which was of primary
interest,  was  .64  for  Grade  4,  .60  for  Grade  8,  .51  for  Grade  11,  and  .53 
for  the  PSAT  subsample.    The  size  of  these  correlations  is  generally 
consistent  with  expectations,  although  the  considerably  lower  correlations 
for  Grade  11  and  for  the  PSAT  subsample  were  somewhat  surprising.  (Standard 
errors  of  these  correlations  cannot  be  obtained  by  standard  methods. 
Jackknifed standard errors could be computed, but are not currently
available.)    It  is  likely  that  the  smaller  correlations  result  from  greater 
homogeneity  among  Grade  11  NAEP  participants  than  among  the  fourth  or 
eighth  graders.    The  variability  of  number-right  scores  for  reading  and 
writing  can  be  estimated  using  the  TCM  method  described  above.    For  Grade 




Table 14.1(2)

Correlations of Reading, Writing, and Selected Background Variables*

                                Home               Kind of   Kind of
         Reading   Writing  Language    Grades    Reader    Writer

Grade 4
  R         1.00
  W          .64      1.00
  HL         .15       .12      1.00
  G          .39       .28       .07      1.00
  KR         .31       .19       .05       .21      1.00
  KW         .02      -.09      -.01       .16       .22      1.00

Grade 8
  R         1.00
  W          .60      1.00
  HL         .11       .10      1.00
  G          .47       .33       .02      1.00
  KR         .33       .23       .08       .26      1.00
  KW         .17       .15       .04       .20       .27      1.00

Grade 11
  R         1.00
  W          .51      1.00
  HL         .16       .09      1.00
  G          .43       .32       .02      1.00
  KR         .33       .18       .08       .24      1.00
  KW         .29       .25       .09       .27       .31      1.00

*Variables are defined in Table 14.1(4). Methodology used to obtain
correlations is described in the text.


Table 14.1(3)

Correlations of Reading, Writing, PSAT Scores, and
Selected Background Variables*
(PSAT subsample only)

                                                    Home               Kind of   Kind of
         Reading   Writing    PSAT-V    PSAT-Q  Language    Grades    Reader    Writer

  R         1.00
  W          .53      1.00
  PV         .67       .32      1.00
  PQ         .57       .26       .67      1.00
  HL         [?]       .10       .15       .08      1.00
  G          .50       .35       .46       .50       .05      1.00
  KR         .29       .29       .38       .17       .13       [?]      1.00
  KW         .31       .28       .37       .22       .09       .32       .53      1.00

[?] Entry illegible in the source copy.

*Variables are defined in Table 14.1(4). Methodology used to obtain
correlations is described in the text.


Table 14.1(4)

Definition of Variables for Analyses in Tables 14.1(2) and 14.1(3)

Reading:  All calibrated reading items were included, provided that they
    were BIB-spiralled with all other items (see Chapter 5). This
    criterion resulted in the use of 108 out of 118 calibrated items for
    Grade 4, 106 out of 124 for Grade 8, and 95 out of 113 for Grade 11.

Writing:  All writing items used in the ARM scaling were included (8 items
    for Grade 4, 10 items for Grade 8, and 6 items for Grade 11).

Home Language:  What language do you speak most often in your home?

    1 = English
    0 = Other

Grades:  Which of the following best describes your grades so far in
    school?

    9 = Mostly A (a numerical            5 = Mostly C (70-79)
        average of 90-100)               4 = Both C and D
    8 = Both A and B                     3 = Mostly D (60-69)
    7 = Mostly B (80-89)                 2 = Both D and E
    6 = Both B and C                     1 = Mostly below D (below 60)

Kind of Reader:  What kind of reader do you think you are?

    3 = A very good reader
    2 = A good reader
    1 = A poor reader

Kind of Writer:  [Instructions - The following sentences are true for some
    people. They may or may not be true for you, or they may be true for
    you only part of the time. How often is each of the following
    sentences true for you?]  I am a good writer.

    5 = Almost always                    2 = Less than half the time
    4 = More than half the time          1 = Never or hardly ever
    3 = About half the time

PSAT-V:  Score on verbal section of Preliminary Scholastic Aptitude Test

PSAT-Q:  Score on quantitative section of Preliminary Scholastic Aptitude
    Test


Table 14.1(5)

Correlations of Reading and Writing with
Frequency of Reading and Writing Activities*

                                                                                    PSAT
                                      Grade 4        Grade 8        Grade 11      Subsample
                                        R     W        R     W        R     W        R     W

Reading Activities

 1. During free time, how often
    do you read a book?               .17   .20      .19   .23      .18   .07      .19   .02

 2. During free time, how often
    do you read a newspaper
    or magazine?                      .02   .09      .12   .11      .17   .16      .17   .10

 3. How many pages do you read
    in school and for homework?       .09   .06      .10   .10      .21   .16      .16   .12

 4. How often do you read
    a story or novel?                 .20   .06      .37   .26      .27   .17      .31   .10

 5. How often do you read
    a newspaper?                      .10  -.06      .17   .22      .17   .15      .09   .03

 6. How often do you read
    a magazine?                       .02  -.02      .13   .13      .09   .07      .11   .00

 7. How often do you read for
    fun on your own time?             .23   .11      .30   .28      .22   .10      .22   .21

 8. How often do you read on
    your own in school?               .21   .16      .14   .24      .12   .14      .15   .04

Writing Activities

 9. How much of English class
    is spent learning to write?       (see note)

10. How many stories did you
    write for English
    last week?                        (see note)

11. How many writings did you
    do last week that were
    not for school?                   (see note)

12. How often do you write
    stories or poems that
    are not for school?              -.14  -.11     -.12  -.02      .00   .02      .02  -.12

Note: The column positions of the entries for items 9, 10, and 11 were lost
in the source copy. The values printed on the item lines are -.22 (item 9);
-.11 (item 10); and .06, .01, .09, .03, .08, .03 (item 11). The remaining
values, in the order printed, are .10, -.01, .02; .08, -.13, -.06; .07; .23;
.07; .11; .08; -.20; .02; .11; .07; .11.

*Item texts and response codes are given in Table 14.1(6). Methodology used to
obtain correlations is described in the text.


Table  14.1(6) 


Item  Text  and  Response  Codes  for  Reading  and  Writing 


When  you  have  free  time,  how  often  do  you  do  each  of  the  following? 

1.  Read  a  book 

2.  Read  a  newspaper  or  magazine 

3  =  Every  day  or  almost  every  day 
2 = About once a week
1  =  Once  a  year  or  less 

3.  About how many pages a day do you have to read in school and for
homework?


5  =  More  than  20 
4  =  16  -  20 
3  =  11  -  15 
2  =  6-10 
1  =    5  or  less 


How  often  do  you  read  each  of  the  following? 

4.  Part  of  a  story  or  novel 

5.  A  newspaper 

6.  A  magazine 


5  =  Almost  every  day 
4  =  Once  or  twice  a  week 
3  =  Once  or  twice  a  month 
2  =  A  few  times  a  year 
1  =  Never  or  hardly  ever 


How  often  do  you  do  each  of  the  following  things? 

7.  Read  for  fun  on  your  own  time 

8.  Read  on  your  own  in  school 


5  =  Almost  every  day 
4  =  Once  or  twice  a  week 
3  =  Once  or  twice  a  month 
2  =  A  few  times  a  year 
1  =  Never  or  hardly  ever 




Table 14.1(6)
(continued) 


9.    About  how  much  of  your  time  in  English  class  is  spent  learning  to 
write? 


5  =  Most  of  the  time 

4  =  More  than  half  the  time 

3  =  About  half  the  time 

2  =  Less  than  half  the  time 

1  =  None  or  almost  none  of  the  time 

About  how  many  of  each  of  the  following  kinds  of  writing  did  you 
do  for  your  English  class  last  week? 
10.    A  story 


3  =  3  or  more 
2  =  1  or  2 
1  =  None 


11.    About  how  many  times  during  last  week  did  you  write  something 
that  was  NOT  a  school  assignment? 


4  =  3  or  more 
3  =  2 
2 = 1
1  =  None 


How  often  do  you  write  each  of  the  following  things? 
12.    Stories  or  poems  that  are  not  schoolwork 


4  =  Almost  every  day 
3  =  Once  or  twice  a  week 
2  =  Once  or  twice  a  month 
1  =  Never  or  hardly  ever 




11, the standard deviations were 15.0 and 2.6 for reading and writing,
respectively; for the PSAT subsample, the corresponding values were 10.9
and 2.0. The standard deviations for Grades 8 and 4 were substantially
larger: 16.9 and 3.7 for Grade 8 and 21.2 and 3.6 for Grade 4. The
smaller variability in Grade 11 probably occurred in part because some low
achievers drop out of school before Grade 11. In addition, the rate of
participation in NAEP was somewhat smaller for Grade 11 students than for
students in Grades 4 or 8, which could further restrict the range of
reading and writing proficiency in the Grade 11 sample. Because only a
select subgroup of students take the PSAT, the variability for this
subsample was still smaller.

The findings based on PSAT scores were moderately supportive of the
validity of the NAEP reading and writing assessments. Reading had quite a
high correlation, .67, with PSAT verbal scores, and a lower correlation,
.57, with PSAT quantitative scores. Similarly, the correlation between
writing and PSAT verbal scores was .32, which was higher than the
correlation of .26 between writing and PSAT quantitative scores. However,
the large correlation of reading and PSAT-V was matched in size by the
correlation of PSAT-V with PSAT-Q. Also, the correlation of .57 between
reading and PSAT-Q was slightly larger than the correlation between reading
and writing, which was .53. In interpreting these patterns of
correlations, it is necessary to keep in mind that the small number of
writing items and the consequent low reliability result in the attenuation
of the correlations of writing with other variables.
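
The attenuation mentioned here can be illustrated with the classical correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below is illustrative only: the function name and the reliability values are hypothetical placeholders, not figures reported for the NAEP scales; only the observed correlation of .53 comes from the text.

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)

# Observed reading-writing correlation quoted in the text.
r_observed = 0.53

# HYPOTHETICAL reliabilities, chosen only to show the direction of the effect:
# a short writing scale is assumed to be much less reliable than reading.
rel_reading = 0.90
rel_writing = 0.50

r_corrected = disattenuate(r_observed, rel_reading, rel_writing)
print(f"observed r = {r_observed:.2f}, corrected r = {r_corrected:.2f}")
```

Under these assumed reliabilities the corrected value is noticeably larger than the observed one, which is the sense in which low writing reliability depresses every correlation that involves writing.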

The  correlations  of  reading  and  writing  with  the  background  and 
attitude  items  included  in  Tables  14.1(2)  and  (3)  were,  for  the  most  part, 
small  to  moderate  positive  correlations.    In  general,  reading  had  higher 
correlations  with  these  items  than  writing.  This  was  not  unexpected,  given 
that  there  were  many  more  reading  than  writing  items,  resulting  in  higher 
reliability.    The  correlations  of  reading  and  writing  with  home  language 
were  small,  ranging  from  .11  to  .20  for  reading  and  .09  to  .12  for  writing. 
Correlations  with  self-reported  grades  in  school  were  moderate,  ranging 
from .39 to .50 for reading and from .28 to .35 for writing. The question,
"What kind of reader do you think you are?" had correlations ranging from
.29 to .33 with reading and .18 to .29 with writing. The question asking
the  students  about  their  writing  ability  had  correlations  ranging  from  .17 
to  .31  with  reading  and  from  .15  to  .28  with  writing  in  the  Grade  8,  Grade 
11,  and  PSAT  subsamples.    However,  in  Grade  4,  the  corresponding 
correlations  were  .02  and  -.09.    A  clue  to  this  discrepancy  is  provided  in 
Table  14.1(4):    The  instructions  and  phrasing  of  the  question  are  probably 
confusing to fourth graders. In all three grades, the correlation of the
"kind of writer" question with reading was slightly higher than the
correlation with writing. This may result from the greater number of
reading  items. 

Results  for  the  twelve  additional  background  and  attitude  items 
included  in  the  analysis  are  shown  in  Table  14.1(5).    These  items  pertained 
to  the  frequency  with  which  students  participated  in  reading  and  writing 
activities,  both  in  and  out  of  school.    For  items  pertaining  to  reading 
activities, most correlations were small and positive. The items that were
most highly correlated with reading and writing pertained to the frequency
of  reading  books  during  free  time,  reading  stories  or  novels,  reading  for 
fun,  and  reading  on  one's  own  in  school.    Most  of  the  reading  activity 
items  were  more  highly  correlated  with  reading  than  with  writing.    This  was 
expected,  both  because  the  items  referred  to  reading  activities  and  because 
more  reading  than  writing  items  were  included  in  the  analysis. 

Results  for  the  items  pertaining  to  writing  activities  were  much  more 
puzzling.    For  items  9,  10,  and  12,  most  of  the  correlations  with  reading 
and  writing  were  negative.    In  considering  some  related  results,  the 
authors  of  The  Writing  Report  Card  speculated  that  English  teachers  may 
assign  more  writing  to  low-achieving  students  than  to  skilled  students. 
This  would  not,  however,  explain  the  negative  correlations  for  item  12, 
which  refers  to  writings  that  are  not  schoolwork.    It  could  be  hypothesized 
that there is a tendency for low achievers to overstate their
accomplishments, resulting in negative associations of proficiency with
self-reported  frequency  of  reading  and  writing.    This  still  would  not 
account  for  the  fact  that  item  11,  which  appears  to  be  similar  to  item  12, 
has  positive  correlations  with  reading  and  writing. 

The  analyses  of  Tables  14.1(3)  and  (5)  were  repeated  within  subsamples 
based  on  sex  and  ethnicity.    For  Grade  8,  analyses  were  conducted 
separately for males and females. Both this analysis and previous
correlational  analyses  for  all  three  grade  samples  showed  that  the 
correlational  structure  for  males  and  females  was  essentially  the  same. 
For  each  of  the  three  grades,  correlational  analyses  were  also  conducted 
separately for whites, blacks, and Hispanics. At each grade level, the
unweighted sample size for these correlational analyses was 13,000 to
17,000 for whites, slightly over 3,000 for blacks, and 2,000 to 3,000 for
Hispanics.    The  typical  number  of  respondents  available  to  estimate  each 
coefficient  in  the  original  correlation  matrices  was  about  150  to  200  for 
the  white  samples,  about  35  for  the  black  samples,  and  about  25  for  the 
Hispanic  samples. 

Although  some  ethnic  group  differences  were  evident,  there  were  few 
consistent  or  interpretable  patterns.    One  group  difference  that  did  show 
some  consistency  involved  the  correlations  between  reading  and  writing, 
displayed  below. 

            Grade 4    Grade 8    Grade 11

Hispanic      .66        .70        .54
White         .61        .57        .49
Black         .64        .48        .37


In Grades 8 and 11, the correlation for Hispanics was higher than the
correlation for whites, which, in turn, was considerably higher than that
for  blacks.    In  Grade  4,  all  ethnic  groups  had  similar  correlations.  This 
finding  may  be  attributable  in  part  to  differences  across  ethnic  groups  in 
the  range  of  reading  and  writing  proficiency.    Variability  in  reading  and 
writing tended to be somewhat higher in Hispanics than in blacks and
whites. Another finding that was consistent across grades was the ethnic
group  differences  in  the  correlations  of  reading  activity  items  (1-8  on 
Table  14.1(5))  with  reading  and  writing.    These  correlations  tended  to  be 
smaller  for  blacks  than  for  whites  and  Hispanics.    Also,  the  correlations 
of  items  9,  10,  and  12  with  reading  and  writing  tended  to  be  somewhat  more 
negative  for  blacks  and  Hispanics  than  for  whites. 

It  is  difficult  to  assess  the  relevance  of  the  correlational  analyses 
of Table 14.1(5) to the validity of the NAEP reading and writing assessment
because  the  validity  of  the  responses  to  the  reading  and  writing  activity 
items  is  itself  in  question.    That  is,  no  data  are  available  on  the 
relation  between  these  student  reports  and  the  actual  frequency  of  reading 
and  writing  activities.    Therefore,  correlations  based  on  these  items  (and 
on the self-reported items included in Table 14.1(3)) cannot be given as
much weight in the validity assessment as the content-related evidence, the
correlations  between  reading,  writing,  and  PSAT  scores,  and  the  patterns  of 
proficiency  means  across  sociodemographic  groups. 


14.1.3    Summary  of  Validity  Evidence 

The  NAEP  reading  and  writing  exercises  were  the  products  of  an 
elaborate item development process, which included the participation of
subject area experts, curriculum specialists, teachers, school
administrators  and  parents,  in  addition  to  the  NAEP  committees  on  reading 
and  writing.    The  process  involved  developing  educational  objectives, 
preparing  items  corresponding  to  these  objectives,  field  testing  the  items, 
and  finally,  selecting  items  for  inclusion  in  the  assessment.    The  final 
set  of  items  was  judged  to  be  consistent  with  the  specified  content  domains 
and  to  conform  to  established  principles  of  test  development.    All  these 
assessment  items  are  available  for  item-level  analyses.    However,  because 
of  practical  and  methodological  considerations,  only  a  subset  of  the  items 
was  included  in  the  reading  and  writing  proficiency  scales.  Therefore, 
these  scales  do  not  fully  correspond  to  the  established  reading  and  writing 
domains. 

In  general,  the  construct-related  evidence  based  on  analysis  of 
proficiency  means  and  correlations  appears  to  support  the  validity  of  the 
NAEP  reading  and  writing  assessments.    Differences  across  sociodemographic 
groups  based  on  grade,  sex,  parental  education,  and  size  and  type  of 
community  tended  to  be  consistent  with  findings  of  previous  research  and 
with  theories  of  cognitive  development. 

Correlational  analyses  of  reading,  writing,  PSAT  scores,  and  selected 
background  and  attitude  items  produced  an  unexpected  finding:    Three  of 
four items pertaining to frequency of writing activities were negatively
correlated with reading and writing. However, most of the correlational
results were quite supportive of the validity of the NAEP reading and
writing assessment. The correlation between reading and writing was
moderately high, as expected, and reading and writing were both more
highly correlated with the PSAT verbal scores than with the PSAT
quantitative scores. Reading and writing had moderate correlations with
self-reported grades and small to moderate correlations with student
self-assessments  of  reading  and  writing  ability. 




Chapter  14.2 
DESIGN EFFECTS¹


Eugene  G.  Johnson 
Educational  Testing  Service 


The major computational load in measuring uncertainty for any statistic
is in the estimation of the uncertainty due to sampling variability. The
jackknife variance estimation procedure requires that the statistic be
repeatedly recomputed to obtain an estimate of the sampling variance of the
statistic. In the current design, this involves 33 computations, once for
the overall estimate and once for each of the 32 PSU pairs. If the
population value of interest is based on proficiency values, so that the
statistic is computed on a set of plausible values, then, for reasons given
in Chapter 10.3, the entire process should be repeated once for each set of
plausible values.
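
The 33 computations can be sketched as follows. This is a minimal illustration, not the production procedure: it assumes a paired jackknife in which replicate i drops one PSU of pair i and doubles the weights of its partner, and it runs on synthetic data. The record layout, function names, and all numeric values are hypothetical.

```python
import random

def weighted_mean(values, weights):
    """Weighted mean of values; stands in for any statistic of interest."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

def paired_jackknife_variance(records, statistic=weighted_mean):
    """Return (overall estimate, jackknife variance).

    records: list of (pair_id, psu_in_pair, weight, value) tuples, where
    psu_in_pair is 0 or 1. ASSUMPTION: replicate i drops PSU 0 of pair i and
    doubles the weights of PSU 1 of that pair; other weights are unchanged.
    """
    values = [value for _, _, _, value in records]
    weights = [weight for _, _, weight, _ in records]
    t_full = statistic(values, weights)          # overall estimate

    variance = 0.0
    for pair in sorted({pair_id for pair_id, _, _, _ in records}):
        replicate = []
        for pair_id, psu, weight, _ in records:
            if pair_id != pair:
                replicate.append(weight)
            elif psu == 0:
                replicate.append(0.0)            # dropped PSU
            else:
                replicate.append(2.0 * weight)   # partner counts twice
        t_rep = statistic(values, replicate)     # one recomputation per pair
        variance += (t_rep - t_full) ** 2
    return t_full, variance

# Synthetic data: 32 PSU pairs, 2 PSUs per pair, 10 students per PSU.
random.seed(0)
records = []
for pair_id in range(32):
    for psu in (0, 1):
        p_correct = random.uniform(0.4, 0.8)     # hypothetical PSU-level rate
        for _ in range(10):
            weight = random.uniform(50.0, 150.0) # hypothetical sampling weight
            value = 1 if random.random() < p_correct else 0
            records.append((pair_id, psu, weight, value))

estimate, jk_variance = paired_jackknife_variance(records)
print(f"estimate = {estimate:.3f}, jackknife SE = {jk_variance ** 0.5:.4f}")
```

With plausible values, the whole call would be repeated once per set of plausible values, which is what makes the full procedure expensive.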

This section describes how to approximate the sampling variability of
any statistic by less computationally intensive methods. In the case that
the statistic is based on proficiency values, the method will provide an
estimate of the sampling variability for the statistic. The component of
variability due to imputation should still be estimated by recomputing the
statistic on different sets of plausible values (see Section 13.3.3).

It  is  inappropriate  to  estimate  the  sampling  variability  of  any 
statistic  based  on  the  NAEP  database  by  using  simple  random  sampling  (SRS) 
formulas, which are the ones used by most standard
statistical  software  such  as  SPSS  and  SAS,  will  produce  variance  estimates 
which  are  generally  much  smaller  than  is  warranted  by  the  sample  design. 

It  may  be  possible  to  account  approximately  for  the  effects  of  the 
sample  design  by  using  an  inflation  factor,  the  design  effect,  developed  by 
Kish  (1967)  and  extended  by  Kish  and  Frankel  (1974).    The  design  effect  for 
a  statistic  is  the  ratio  of  the  actual  variance  of  the  statistic  (taking 
the  sample  design  into  account)  over  the  conventional  variance  estimate 
based  on  a  simple  random  sample  with  the  same  number  of  elements.    To  avoid 
sources  of  bias  due  to  improper  representation,  this  conventional  estimate 
must use the sampling weights. The design effect may be used to adjust
error estimates based on simple random sampling assumptions to account
approximately for the effect of the design. In practice, this is often
accomplished by dividing the total sample size by the design effect and
using this effective sample size in the computation of errors. Note that
the value of the design effect depends on the type of statistic computed
and the variables considered in a particular analysis as well as the
combined clustering, stratification, and weighting effects occurring among
sampled elements.

¹ The statistical programming for this section was provided by Bruce
Kaplan, David Freund, and Laurel Barnett. The figures were produced by Ira
Sample.

Based  on  empirical  results  and  theoretic  considerations,  Kish  and 
Frankel (1974) have developed several conjectures about design effects:

(1)  Generally,  the  design  effects  for  complex  statistics  from  complex 
samples  are  greater  than  one,  causing  variances  based  on  simple 
random  sampling  assumptions  to  tend  to  be  underestimates. 

(2)  The  design  effects  for  complex  statistics  (such  as  regression 
coefficients)  tend  to  be  smaller  than  the  corresponding  design 
effects  for  means  of  the  same  variables.    Hence,  the  design  effects 
for  means,  which  are  more  easily  computed,  tend  to  give 
overestimates  of  the  design  effects  of  complex  statistics. 

(3)  Qualitatively  and  comparatively,  the  design  effects  of  complex 
statistics  tend  to  resemble  those  of  means;  variables  with  a  high 
design  effect  of  the  mean  also  tend  to  have  high  design  effects  for 
complex  statistics  involving  those  variables. 

To  incorporate  the  design  effect  idea  in  a  statistical  analysis, 
proceed  in  the  following  manner: 

(1)  For a given class of statistics (e.g., means, percentile points,
regression coefficients), compute the jackknife variance for a
number of cases corresponding to the estimate of a particular
statistic from a specified subgroup of the population. The cases
should cover the range of situations for which the approximation is
to be used. If various subpopulations are to be considered, it is
important to have information on the relative variability within
each subgroup. This is especially important if certain subgroups
are more highly clustered in the sample.

(2)  For the identical cases, compute the conventional estimate of the
variance. This estimate must take the sample weights into account
to avoid problems of bias due to improper representation. To
account properly for the difference between the number of
individuals being sampled and the total of the sampling weights,
the weights should be scaled so that their sum equals the sample
size.

(3)  For each case, compute the design effect where the design effect
for case j is

$\mathrm{deff}_j = \mathrm{Var}_{JK}(t_j) / \mathrm{Var}_{CON}(t_j)$,

the ratio of the jackknife variance estimate of the statistic to
its conventional variance estimate.


(4)  If the design effects for the various cases are tolerably similar,
choose  an  overall  composite  design  effect.    If  the  design  effects 
for  certain  subgroups  appear  to  cluster  around  a  markedly  different 
value  from  the  remaining  cases,  treat  those  subgroups  separately. 

(5)  In  the  case  that  a  consistent  overall  design  effect  has  been  found: 

(a)    rescale the weight of each individual so that the sum of the
scaled weights is equal to the effective sample size

$N_{\mathrm{eff}} = \dfrac{\text{sample size}}{\text{design effect}}$


(b)    conduct  a  traditional  weighted  analysis  using  these  scaled 
weights 

(6)  The degrees of freedom for any variance estimate obtained by using
this approach are still at most 32, the number of PSU pairs, as they
were for the jackknife. Accordingly, tests of significance produced
by standard programs (which will use, for the error degrees of
freedom, the effective sample size minus the number of parameters)
should be interpreted with extreme caution because they are likely
to be too liberal. Significance and inferential procedures are
properly based on the smaller error degrees of freedom (32).
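
Steps (3) through (5) of the procedure above can be sketched in a few lines. This is an illustration under stated assumptions: the variance pairs and weights are hypothetical, the function names are invented for the example, and the median is used as the composite only as one possible choice.

```python
import statistics

def design_effects(cases):
    """cases: list of (jackknife_variance, conventional_variance) pairs."""
    return [v_jk / v_con for v_jk, v_con in cases]

def rescale_weights(weights, deff):
    """Scale weights so that they sum to the effective sample size n / deff."""
    n_eff = len(weights) / deff
    factor = n_eff / sum(weights)
    return [w * factor for w in weights], n_eff

# HYPOTHETICAL (jackknife, conventional) variance pairs for four cases.
cases = [(0.00030, 0.00020), (0.00045, 0.00030),
         (0.00028, 0.00020), (0.00064, 0.00040)]
deffs = design_effects(cases)            # step (3): one ratio per case

# Step (4): pick a composite if the ratios are tolerably similar.
# Using the median here is an assumption made for the illustration.
composite = statistics.median(deffs)

# Step (5): rescale hypothetical sampling weights to the effective sample size.
weights = [120.0, 80.0, 200.0, 150.0, 50.0, 100.0]
scaled, n_eff = rescale_weights(weights, composite)

# Step (6): error degrees of freedom stay capped at the number of PSU pairs.
MAX_ERROR_DF = 32

print(f"composite deff = {composite:.2f}, effective n = {n_eff:.2f}")
```

A conventional weighted analysis run with `scaled` in place of the raw weights then reports standard errors inflated by roughly the square root of the composite design effect, but its reported degrees of freedom should still be replaced by a value no larger than 32.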


14.2.1    Some  Design  Effects  from  the  Year  15  Reading  Assessment 

As an example of the distribution of design effects to be expected from
NAEP data, we consider the design effect for the key statistic, P, the
estimated proportion of a specified subgroup of the population who would
correctly respond to a given assessment exercise. This estimate, which is
a weighted mean of the responses of individuals in the subgroup to the
exercise (where an individual's response is either 0 or 1), has a design
effect of the form

$\mathrm{deff}(P) = \mathrm{Var}_{JK}(P) / (P(1 - P)/N)$.

In the above, N is the total number of individuals in the subgroup
responding to the exercise, $\mathrm{Var}_{JK}(P)$ is the jackknife variance of P, and
P(1 - P)/N is the conventional variance estimate of P. (Although the
estimate P(1 - P)/N has the same form as the simple random sampling
estimator of the variance of P, the sample weights have been taken into
account via the weighted estimation of P.)
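
As a small worked instance of the design effect for a proportion, the sketch below computes a weighted P and divides an assumed jackknife variance by P(1 - P)/N. The responses, weights, and the jackknife variance are hypothetical values chosen so the arithmetic is easy to follow.

```python
def weighted_proportion(responses, weights):
    """Weighted mean of 0/1 responses."""
    return sum(w * x for x, w in zip(responses, weights)) / sum(weights)

def deff_proportion(p, n, var_jk):
    """Design effect of a proportion: Var_JK(P) / (P(1 - P) / N)."""
    conventional = p * (1.0 - p) / n
    return var_jk / conventional

# HYPOTHETICAL item responses (1 = correct) and sampling weights.
responses = [1, 0, 1, 1, 0, 1, 0, 1]
weights = [2.0, 1.0, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0]

p_hat = weighted_proportion(responses, weights)   # weights enter only here
n = len(responses)                                # N is the unweighted count

var_jk = 0.042                                    # HYPOTHETICAL jackknife variance
deff = deff_proportion(p_hat, n, var_jk)
print(f"P = {p_hat:.2f}, deff = {deff:.2f}")
```

Note that the weights affect the conventional estimate only through P itself, which matches the parenthetical remark in the text.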


The distributions of design effects for proportions correct by grade
and by demographic subgroup within grade across all cognitive reading
items presented in the Year 15 assessment are indicated in Figures 14.2-1a
through 14.2-3c, and Tables 14.2(1) through 14.2(3).

Table 14.2(1) addresses the distributions of the design effects for the
131 cognitive reading exercises presented to Grade 4 students as a whole
("total") as well as for a variety of demographic subgroups: sex;
race/ethnicity (White, Black, Hispanic, Other); region (Northeast,
Southeast, Central, West); parental education (At Most High School,
Graduated High School, Post-High School, Unknown); and Size and Type of
Community (Rural, Low Metropolitan, High Metropolitan, Big City, Urban
Fringe, Medium City, Small Place). For each of these groupings of Grade 4
students, Table 14.2(1) provides the lower quartile (LoQ), median, upper
quartile (HiQ) and maximum design effect as well as the mean design effect
and the percent of design effects less than 2 and 2.5.

A graphical display of the distributions of the design effects for the
same sets of students appears as the boxplots (strictly, box-and-whiskers
plots) shown in Figures 14.2-1a through 14.2-1c. The left and right
margins of the box in each boxplot correspond to the lower and upper
quartiles, the vertical line within the box to the median; the minimum and
maximum values are indicated by the ends of the horizontal lines (see
Tukey, 1977 for further details). Because the distributions of the design
effects are badly skewed, the plots were symmetrized by plotting the log
(base 10) of the quantiles of the design effects.

Equivalent  information  on  the  distributions  of  design  effects  for  the 
130  cognitive  reading  exercises  presented  to  Grade  8  students  appears  as 
Table 14.2(2) and Figures 14.2-2a through 14.2-2c. The 116 cognitive
reading  items  presented  to  Grade  11  students  are  addressed  by  Table  14.2(3) 
and  Figures  14.2-3a  through  14.2-3c. 

The  particular  demographic  variables  shown  (sex,  race/ethnicity, 
region,  parental  education,  and  size  and  type  of  community)  were  selected 
because  (1)  they  are  major  variables  in  NAEP  reports  and  (2)  they  reflect 
different  types  of  divisions  of  the  population  which  might  have  different 
levels  of  sampling  variability. 

The tables and figures show that the design effects are predominantly
larger than 1, indicating that variance estimates from standard formulas
will generally be too small, sometimes markedly so. Further, the
distributions of design effects appear different for certain subgroups of
the population.

A  striking  feature  of  the  tables  is  the  apparent  lower  sampling 
variability  of  the  Grade  8  data  relative  to  the  other  two  grades.  In 
nearly  every  case,  the  median  design  effect  for  a  subgroup  based  on  Grade  8 
data  is  smaller  than  the  equivalent  medians  for  the  other  two  grades. 
(This is also true for the upper quartiles.) In contrast, the
distributions of design effects for Grades 4 and 11 appear quite
similar: in exactly half of the 22 cases the median design effect for Grade
11 exceeds that for Grade 4.




The smaller design effects for Grade 8 indicate that the effect of the
sample design on variance estimation is less for Grade 8 than for the other
two grades. Since a major determinant of the sampling variability of a
statistic is the degree of clustering in the sample, this would appear to
be a surprising result. The major clustering in the sample is students
within schools. Because the number of schools selected for assessment
decreases by grade, with 661 schools selected at Grade 4, 478 at Grade 8
and 326 at Grade 11, the number of students selected within a school to
respond to a given exercise increases by grade. On average, the number of
students within a school responding to a given exercise is roughly three
for Grade 4, five for Grade 8 and eight for Grade 11. All else being
equal, this would imply that the design effects for Grade 4 would tend to
be the smallest, those for Grade 11 the largest, and those for Grade 8
in-between. However, it is not only the sample cluster size that counts,
but the heterogeneity of the full clusters, before subsampling, that
influences the result. Grade 11 schools are larger and more heterogeneous,
and this latter effect would reduce the design effect for them. But, since
these schools have larger sample clusters, this gain is more than offset.
Grade 4 has smaller and more homogeneous schools, and higher correlations,
but smaller samples per school. The observed phenomenon is the combined
effect.
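
The interplay of cluster size and within-school homogeneity is often summarized by Kish's approximation for cluster samples, deff ≈ 1 + (m - 1)ρ, where m is the average number of sampled elements per cluster and ρ is the intraclass correlation. That approximation is not stated in this section and ignores stratification and weighting, so the sketch below is only a rough illustration. The cluster sizes come from the text; the ρ values are hypothetical, chosen to show how the grade with the middle cluster size can end up with the smallest design effect.

```python
def kish_deff(cluster_size, rho):
    """Kish's approximation: deff = 1 + (m - 1) * rho."""
    return 1.0 + (cluster_size - 1) * rho

# Average students per school responding to a given exercise (from the text).
cluster_size = {4: 3, 8: 5, 11: 8}

# HYPOTHETICAL intraclass correlations: higher in small, homogeneous Grade 4
# schools, lower in larger, more heterogeneous schools at the upper grades.
rho = {4: 0.25, 8: 0.09, 11: 0.08}

approx_deff = {grade: kish_deff(cluster_size[grade], rho[grade])
               for grade in cluster_size}
smallest_grade = min(approx_deff, key=approx_deff.get)

for grade, value in approx_deff.items():
    print(f"Grade {grade}: approximate deff = {value:.2f}")
print(f"Smallest approximate deff: Grade {smallest_grade}")
```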

We  now  turn  to  examining  the  distributions  of  design  effects  within 
subgroups  of  a  given  grade.    The  sampling  variability  of  a  subgroup, 
relative  to  the  entire  sample,  depends  on,  among  other  things,  how  that 
subgroup  is  spread  throughout  the  sample  and  what  (weighted)  proportion  of 
the  total  sample  is  accounted  for  by  the  subgroup.    For  example,  the  white 
subgroup  of  the  race/ethnicity  variable  is  fairly  evenly  spread  throughout 
the sample and accounts for more than 75 percent of the total sample (by
weight)  for  each  age.    Consequently,  the  distribution  of  design  effects  for 
this  subgroup  closely  resembles  that  of  the  total  population. 

The subgroups determined by sex and parental education are also fairly
evenly  spread  across  the  sample.    In  these  cases,  however,  a  given  subgroup 
is  a  smaller  proportion  of  the  total  population.    Consequently,  any  effects 
of  cluster  selection  (students  within  schools)  on  the  variance  estimates 
should  be  reduced,  relative  to  the  total  population,  because  there  are 
fewer  observations  per  cluster  but  roughly  the  same  number  of  clusters. 
The  result  is  a  tendency  for  the  design  effects  for  these  subgroups  to  be 
somewhat  lower  than  those  for  the  total. 

On  the  other  hand,  the  distributions  of  design  effects  by  region,  while 
roughly  having  the  same  median  as  the  total  sample,  also  have  noticeably 
more  variability  (as  measured  by  the  inter-quartile  range).    This  is 
because the partitioning of the entire sample into regions occurs at the
PSU  level  and  so  a  PSU  is  either  entirely  included  or  entirely  excluded  in 
the  estimation  of  statistics  at  the  regional  level.    Since  the  PSU  is  the 
level  of  aggregation  used  for  variance  estimation  purposes,  the  estimated 
variances  of  regional  level  statistics  are  based  on  fewer  degrees  of 
freedom than are those of national level statistics. Consequently, the
sampling variability of the regional level variance estimates must be
larger. 

Overall,  although  the  distributions  of  design  effects  are  different  by 
subgroup,  they  are,  perhaps,  similar  enough  (at  least  within  a  grade)  to 
select  an  overall  composite  value  which  is  adequate  for  most  purposes. 
Because Grade 8 appears to have lower design effects in general, it should
probably  be  treated  separately. 

In  choosing  a  composite  design  effect,  some  consideration  must  be  made 
about  the  relative  consequences  of  overestimating  the  variance  as  opposed 
to  underestimating  the  variance.    For  example,  adopting  the  position  that 
an  overestimate  of  the  variance  is  as  severe  an  error  as  an  underestimate 
leads  to  using  a  composite  which  is  near  to  the  center  of  the  distributions 
of  the  design  effects.    Possible  composites  of  this  type  are  the  mean  and 
median  design  effects.    In  the  current  data,  the  mean  design  effects  are 
1.5,  1.4  and  1.6  for  Grades  4,  8  and  11,  respectively.    These  are  close  to, 
but greater than, the median design effects: 1.4, 1.3 and 1.4.

Alternatively,  one  can  adopt  the  position  that  it  is  a  graver  error  to 
underestimate the variability of a statistic than to overestimate it. For
example, Johnson and King (1986) examine estimation of variances using
design effects (among other techniques) under the assumption that the
consequences  of  an  underestimate  are  three  times  as  severe  as  those  of  an 
overestimate  of  the  same  magnitude.    Assuming  that  the  distribution  of 
design  effects  is  roughly  independent  of  the  jackknife  variance,  so  that 
the  size  of  a  design  effect  does  not  depend  on  the  size  of  the  variance, 
and  adopting  a  loss  function  which  is  a  weighted  sum  of  absolute  values  of 
the  deviations  of  predicted  from  actual  with  underestimates  receiving  three 
times  the  weight  of  overestimates,  produces  the  upper  quartile  of  the 
design  effects  as  the  composite  value.    The  values  of  this  composite,  for 
Grades  4,  8  and  11,  respectively,  are  1.8,  1.6  and  1.8. 
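
The link between the 3-to-1 loss and the upper quartile can be checked numerically. The sketch below simplifies the argument by applying the weighted absolute loss directly to a set of design effects rather than to predicted and actual variances; the design effect values are hypothetical, and the function name is invented for the example.

```python
import statistics

def asymmetric_loss(composite, deffs, under_weight=3.0):
    """Weighted absolute loss: underestimates cost under_weight times as much."""
    loss = 0.0
    for d in deffs:
        if composite < d:                 # composite underestimates this case
            loss += under_weight * (d - composite)
        else:                             # composite overestimates (or matches)
            loss += composite - d
    return loss

# HYPOTHETICAL design effects for nine cases, skewed to the right.
deffs = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.8, 2.1, 2.9]

# The loss is piecewise linear, so a minimum occurs at one of the data values.
best = min(deffs, key=lambda c: asymmetric_loss(c, deffs))

# Upper quartile for comparison (inclusive method of the standard library).
upper_quartile = statistics.quantiles(deffs, n=4, method="inclusive")[2]

print(f"loss-minimizing composite = {best}, upper quartile = {upper_quartile}")
```

With equal weights on over- and underestimates the same search would return the median, which corresponds to the first position described above.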




Table 14.2(1)

Distributions of Design Effects
for Demographic Subgroups

Grade 4

Group       LoQ    Median   HiQ    Max    Mean   % <= 2.0   % <= 2.5

TOTAL       1.25   1.54     1.96   2.88   1.61   7?.?       ?
MALE        1.18   1.44     1.69   2.89   1.45   ?          ?
FEMALE      1.13   1.38     1.67   2.42   1.41   9?.7       ?
WHITE       1.23   1.55     1.85   3.42   1.60   79.?       ?
BLACK       1.13   1.38     1.72   2.96   1.45   ?          ?
HISPANIC    1.05   1.46     1.86   5.34   1.55   ?          ?
OTHER       0.90   1.08     1.38   3.45   1.18   96.9       99.2
NE          1.12   1.53     2.20   4.66   1.72   67.2       84.7
SE          1.??   1.47     1.98   3.90   1.54   76.3       89.3
CENTRAL     1.05   1.62     2.27   4.50   1.71   64.1       83.2
WEST        0.93   1.30     1.98   5.20   1.55   75.6       84.0
< H.S.      1.06   1.30     1.68   3.01   1.39   87.?       99.2
GRAD HS     1.06   1.31     1.65   2.43   1.36   93.1       100.0
POST HS     1.11   1.38     1.69   3.03   1.42   90.1       96.5
UNKNOWN     1.13   1.37     1.55   2.57   1.38   95.4       99.2
RURAL       1.02   1.37     1.82   5.21   1.52   79.4       90.8
LOW MET     0.90   1.20     1.70   3.59   1.32   87.8       95.4
HI MET      1.1?   1.56     1.95   4.14   1.63   75.6       87.8
BIG CITY    0.88   1.23     1.68   3.26   1.33   84.7       98.5
FRINGE      0.89   1.25     1.65   4.08   1.36   84.7       93.1
MED CITY    1.10   1.76     2.30   5.21   1.?6   58.0       82.4
SMALL PL    1.05   1.40     1.70   ?.14   1.43   84.7       96.9


[Figure 14.2-1a. Grade 4: log base 10 of design effects. Boxplots for TOTAL,
MALE, FEMALE, WHITE, BLACK, HISPANIC, and OTHER. Horizontal axis: LOG10
(DESIGN EFFECTS), from -1.0 to 1.0.]


[Figure 14.2-1b. Grade 4: log base 10 of design effects. Boxplots for TOTAL,
NE, SE, CENTRAL, WEST, < H.S., GRAD H.S., POST H.S., and UNKNOWN. Horizontal
axis: LOG10 (DESIGN EFFECTS), from -1.0 to 1.0.]


[Figure 14.2-1c. Grade 4: log base 10 of design effects. Boxplots for TOTAL,
RURAL, LOW MET, HI MET, BIG CITY, FRINGE, MED CITY, and SMALL PL. Horizontal
axis: LOG10 (DESIGN EFFECTS), from -1.0 to 1.0.]


Table 14.2(2)

Distributions of Design Effects
for Demographic Subgroups

Grade 8

Group       LoQ    Median   HiQ    Max    Mean   % <= 2.0   % <= 2.5

TOTAL       ?.13   1.36     1.57   2.52   1.39   92.3       99.2
MALE        1.07   1.27     1.52   2.44   1.29   98.5       100.0
FEMALE      1.02   1.26     1.51   2.00   1.27   98.5       100.0
WHITE       1.09   1.32     1.53   2.79   1.32   94.6       9?.2
BLACK       1.08   1.29     1.59   2.74   1.38   90.8       97.7
HISPANIC    0.97   1.33     1.87   4.68   1.54   80.0       86.9
OTHER       0.83   1.13     1.49   2.34   1.16   97.7       100.0
NE          0.83   1.23     1.78   3.57   1.38   78.5       89.2
SE          0.97   1.34     1.90   3.20   1.43   80.0       91.5
CENTRAL     0.87   1.33     1.83   4.03   1.47   80.0       87.7
WEST        0.86   1.23     1.64   3.61   1.30   88.5       96.2
< H.S.      0.98   1.23     1.51   2.92   1.27   93.1       99.2
GRAD HS     1.07   1.27     1.46   2.18   1.29   96.2       100.0
POST HS     0.96   1.17     1.45   2.70   1.21   98.5       99.2
UNKNOWN     0.98   1.16     1.38   2.43   1.20   98.5       100.0
RURAL       0.86   1.39     1.92   4.14   1.45   77.7       92.3
LOW MET     0.89   1.37     1.90   4.14   1.49   76.9       88.5
HI MET      0.98   1.39     2.05   5.03   1.60   73.8       86.2
BIG CITY    0.75   1.12     1.76   5.12   1.35   80.8       87.7
FRINGE      0.60   0.94     1.36   3.95   1.10   89.2       95.4
MED CITY    1.13   1.64     2.18   4.89   1.76   65.4       81.5
SMALL PL    0.90   1.26     1.61   2.85   1.31   87.7       97.7


[Figure 14.2-2a. Grade 8: Log Base 10 of Design Effects. Groups shown: TOTAL, MALE, FEMALE, WHITE, BLACK, HISPANIC, OTHER. Horizontal axis: LOG10 (DESIGN EFFECTS), -1.0 to 1.0.]


[Figure 14.2-2b. Grade 8: Log Base 10 of Design Effects. Groups shown: TOTAL, NE, SE, CENTRAL, WEST, < H.S., GRAD H.S., POST H.S., UNKNOWN. Horizontal axis: LOG10 (DESIGN EFFECTS), -1.0 to 1.0.]


[Figure 14.2-2c. Grade 8: Log Base 10 of Design Effects. Groups shown: TOTAL, RURAL, LOW MET, HI MET, BIG CITY, FRINGE, MED CITY, SMALL PL. Horizontal axis: LOG10 (DESIGN EFFECTS), -1.0 to 1.0.]


Table 14.2(3)

Distributions of Design Effects
for Demographic Subgroups

Grade 11

Group        LoQ   Median    HiQ     Max    Mean    % <= 2.0      % <= 2.5

TOTAL       1.23    1.55    1.95    2.84    1.63   [illegible]   [illegible]
MALE        1.12    1.32    1.65    2.80    1.42   [illegible]   [illegible]
FEMALE      1.21    1.46    1.75    3.65    1.55   [illegible]   [illegible]
WHITE       1.24    1.55    1.83    3.24    1.57   [illegible]   [illegible]
BLACK       1.13    1.54    2.04    3.60    1.70   [illegible]   [illegible]
HISPANIC    0.97    1.43    2.02    3.40    1.52      74.1       [illegible]
OTHER       0.94    1.16    1.40    2.60    1.21      92.2          99.1
NE          1.35    1.89    2.65    5.51    2.10      53.4          71.6
SE          0.96    1.44    2.02    5.42    1.57      74.1          90.5
CENTRAL     0.97    1.48    2.02    4.32    1.59      75.0          87.1
WEST        0.80    1.28    1.73    4.24    1.37      81.0          90.5
< H.S.      1.12    1.35    1.63    2.42    1.41      88.8         100.0
GRAD HS     1.07    1.30    1.59    2.60    1.37      91.4          99.1
POST HS     1.24    1.46    1.73    2.76    1.52      86.2          95.7
UNKNOWN     0.96    1.17    1.47    2.53    1.22      95.7          99.1
RURAL       1.05    1.42    1.93    6.21    1.59      77.6          92.2
LOW MET     1.12    1.57    2.37    5.31    1.85      65.5          77.6
HI MET      0.99    1.60    2.29    6.54    1.82      67.2          79.3
BIG CITY    0.78    1.09    1.57    3.08    1.25      87.1          94.0
FRINGE      0.77    1.01    1.38    3.40    1.12      93.1          97.4
MED CITY    0.79    1.24    1.78    4.29    1.39      78.4          90.5
SMALL PL    1.03    1.36    1.77    2.98    1.45      81.9          93.1


[Figure 14.2-3a. Grade 11: Log Base 10 of Design Effects. Groups shown: TOTAL, MALE, FEMALE, WHITE, BLACK, HISPANIC, OTHER. Horizontal axis: LOG10 (DESIGN EFFECTS), -0.40 to 0.60.]


[Figure 14.2-3b. Grade 11: Log Base 10 of Design Effects. Groups shown: TOTAL, NE, SE, CENTRAL, WEST, < H.S., GRAD H.S., POST H.S., UNKNOWN. Horizontal axis: LOG10 (DESIGN EFFECTS), -1.0 to 1.0.]


[Figure 14.2-3c. Grade 11: Log Base 10 of Design Effects. Groups shown: TOTAL, RURAL, LOW MET, HI MET, BIG CITY, FRINGE, MED CITY, SMALL PL. Horizontal axis: LOG10 (DESIGN EFFECTS), -1.0 to 1.0.]


IMPLEMENTING  THE  NEW  DESIGN: 
THE  NAEP  1983-84  TECHNICAL  REPORT 


PART  III 




Chapter  15 


ESTIMATES OF THE READING AND WRITING PROFICIENCY
OF  AMERICAN  STUDENTS 


Albert  E.  Beaton 
David  S.  Freund 
Bruce A. Kaplan

Educational  Testing  Service 


This  part  of  the  technical  report  presents  estimates  of  the  reading  and 
writing  proficiency  of  students  in  American  schools.  The  first  part  of  this 
report  described  how  the  students  were  selected,  how  they  were  assessed, 
and  how  their  responses  moved  from  assessment  sessions  to  a  carefully 
constructed data base, ready for analysis. The second part described the
methods  of  data  analysis,  including  scaling  and  parameter  estimation.  In 
this  third  part,  estimates  of  how  students  are  performing  in  school,  and 
estimates  of  the  sampling  error,  are  presented. 

This  is  a  technical  report  and  is  not  intended  to  be  interpretive. 
Estimates  are  presented,  but  no  attempt  is  made  to  explain  why  the  students 
behaved  in  the  way  that  they  did.  Interpretive  results  are  presented  in 
NAEP reports such as The Reading Report Card: Progress Toward Excellence
in Our Schools (1985), and Writing: Trends Across the Decade, 1974-84
(Applebee, Langer, & Mullis, 1986a). We will leave it to experts in the
educational process to hypothesize why the results occurred. We have made
the public-use data tapes (Barone, Norris, & Rogers, 1986) available to
those who wish to estimate other functions of student performance from the
NAEP data or to search for possible explanations for the student
performance that is reported here.

Clearly,  neither  this  report,  nor  any  report,  could  present  all  of  the 
population  estimates  that  are  made  possible  by  the  NAEP  database.  The 
analysis  of  the  1983-84  NAEP  data  has  resulted  in  the  production  of  many 
thousands  of  tables  containing  estimates  of  the  proficiency  of  students, 
and  various  subgroups  of  students,  in  American  schools.  These  tables  have 
been  bound  in  books  called  almanacs;  the  contents  of  24  such  almanacs  are 
described  in  Chapter  13.4.  We  have  selected  a  few  of  the  most  basic  tables 
for  presentation  here.  In  addition,  some  tables  that  are  not  included  in 
almanacs  are  presented. 

The technical details of the estimation process that underlies these
tables are covered in the previous parts of this report and are not repeated
here. For a detailed description of how to read and use the tables selected
from the almanacs, the reader should refer to Chapter 13.4. The
computational procedures for other tables will be noted as needed.


15.1    Population  Estimates 

The  NAEP  Year  15  data  includes  a  number  of  different  samples  from  which 
population  estimates  can  be  made,  and  Westat,  Inc.  has  developed  an 
appropriate  set  of  sampling  weights  for  the  students  in  each  sample.  All 
estimates  of  population  parameters  use  these  sampling  weights. 

Table  15(1)  shows  the  sizes  of  the  various  samples  and  the  sums  of 
their  sampling  weights  by  grade/age  combination.  The  sums  of  the  weights 
for  the  spiral  samples,  which  are  by  far  the  largest,  estimate  the  numbers 
of  students  who  are  in  each  grade/age  combination  and  who  would  be 
assessable. The sums of the weights of the excluded students estimate the
numbers  of  students  in  each  grade/age  combination  who,  in  their  schools' 
judgment,  would  not  be  assessable.  The  sums  of  the  estimates  for  the  spiral 
sample  and  for  the  excluded  sample  are  estimates  of  the  total  number  of  in- 
school  students  in  the  grade/age  combination. 

The  four  tape  samples  are  defined  by  age  only,  and  each  sum  of  weights 
is  an  estimate  of  the  number  of  age-eligible  students  in  an  age  category 
who,  in  their  school's  judgment,  would  be  assessable.  These  weight  sums  can 
be  added  to  the  sum  of  weights  of  the  age-eligible  excluded  students  to 
make  an  estimate  of  the  number  of  students  in  an  age  category.  The 
differences  in  the  estimates  from  the  four  tape  samples  are  due  to  sampling 
error. 

In  most  cases,  the  number  of  students  in  a  grade/age  combination  is  not 
of  interest;  a  researcher  will  be  interested  in  estimating  the  number  of 
students  at  either  a  grade  or  an  age.    An  estimate  of  the  number  of 
students  at  an  age  level  can  be  made  by  summing  the  weights  of  only  the 
age-eligible  students,  and  an  estimate  of  the  number  of  students  in  a  grade 
by  summing  the  weights  of  grade-eligible  students. 
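The summing rule described above can be illustrated with a minimal sketch. The record layout and the three example weights below are hypothetical and are not taken from the NAEP data tapes; only the rule itself (sum the sampling weights of the students who meet the eligibility condition) comes from the text.

```python
from dataclasses import dataclass


@dataclass
class Student:
    """Hypothetical record: one sampled student and the sampling weight."""
    weight: float
    age_eligible: bool
    grade_eligible: bool


def estimated_population(students, is_member):
    """Estimate a population count by summing the weights of the members."""
    return sum(s.weight for s in students if is_member(s))


# Illustrative values only; real weights come from the sampling design.
sample = [
    Student(weight=150.0, age_eligible=True, grade_eligible=True),
    Student(weight=160.0, age_eligible=True, grade_eligible=False),
    Student(weight=140.0, age_eligible=False, grade_eligible=True),
]

by_age = estimated_population(sample, lambda s: s.age_eligible)
by_grade = estimated_population(sample, lambda s: s.grade_eligible)
by_both = estimated_population(sample, lambda s: s.age_eligible and s.grade_eligible)
by_either = estimated_population(sample, lambda s: s.age_eligible or s.grade_eligible)
```

In this sketch the estimate for students eligible by either age or grade equals the age estimate plus the grade estimate minus the estimate for students eligible by both.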

Table 15(2) shows how many students at each grade level are below, at, or
above the modal age for that grade, and how many at each age level are
below, at, or above the modal grade for that age. These figures were computed from
the  spiral  sample  only.  Along  with  the  counts  from  this  sample,  the  sum  of 
the  weights  (Weighted  N)  for  each  category  is  presented,  and  these  sums  are 
estimates  of  the  numbers  of  students  in  these  categories  in  the  population. 
The  standard  errors  of  these  estimates  and  coefficients  of  variation  are 
also  given. 
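As a worked check, the coefficients of variation printed in Table 15(2) appear to equal the standard error expressed as a percentage of the Weighted N. The formula below is an assumption inferred from the printed values, not a definition stated in this chapter; the two pairs of inputs are read from the Grade 4/Age 9 panel of the table.

```python
def coefficient_of_variation(standard_error, weighted_n):
    """Standard error as a percentage of the estimate (assumed definition)."""
    return 100.0 * standard_error / weighted_n


# Grade 4/Age 9 panel, GRADE TOTAL row, TOTAL column.
cv_total = round(coefficient_of_variation(18935, 3971749), 2)

# Grade 4/Age 9 panel, GRADE > 4 row (9-year-olds above the modal grade).
cv_above_grade = round(coefficient_of_variation(2163, 9936), 2)
```

Under this assumption the two results agree with the printed entries of 0.48 and 21.77.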

Tables 15(3), 15(4), and 15(5) present estimates of the number of
students  in  various  subpopulations  who  could  have  been  assessed  for  each  of 
the  grade/age  combinations.  These  estimates  were  made  from  the  spiral 
sample.  Separate  estimates  are  shown  for  the  different  sexes,  racial/ethnic 
groupings,  regions  of  the  country,  levels  of  parental  education,  and  sizes 
and  types  of  community.  Estimates  are  made  separately  for  age-eligible  and 
grade-eligible students as well as for students who are eligible by both
age and grade and those who are eligible by either age or grade. The actual
numbers of students used in making these estimates are shown in Tables
2(10), 2(11) and 2(12) of Chapter 2.

Tables 15(6), 15(7), and 15(8) present estimates of the number of
students in various subpopulations who, in their schools' judgment, were
unassessable. Separate estimates are also shown for the different sexes,
racial/ethnic groupings, regions of the country, classes of parental
education, and sizes and types of community. Also, estimates are made
separately for age-eligible and grade-eligible students as well as for
students who are eligible by both age and grade and those who are eligible
by either age or grade. The sums of the estimates from these tables and the
corresponding estimates from the previous tables are estimates of the total
number of in-school students, whether they were assessable
or not. The actual numbers of students used in making these estimates are
shown in Tables 2(13), 2(14) and 2(15) of Chapter 2.

Tables 15(9), 15(10), and 15(11) present estimates of the numbers of
[word illegible] students at different age levels. There are four estimates, one
for each of the four tape samples. Separate estimates are also shown
for the different sexes, racial/ethnic groupings, regions of the country,
classes of parental education, and sizes and types of community. The
average of the four estimates is also an estimator of the population size.
These samples cannot be used to estimate grade populations. The actual
numbers of students used in making these estimates are shown in Tables
2(16), 2(17) and 2(18) of Chapter 2.
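The averaging of the four tape-sample estimates can be shown with the age 9 totals, which are the sums of weights reported for the four tape samples in Table 15(1). This is only an arithmetic illustration; the variable names are ours.

```python
# Sums of weights for the four age 9 tape samples, from Table 15(1).
tape_totals_age9 = [3122045, 3005769, 3087985, 3100713]

# The average of the four estimates is itself an estimate of the number of
# assessable 9-year-olds.
average_estimate = sum(tape_totals_age9) / len(tape_totals_age9)

# The spread among the four estimates reflects sampling error.
estimate_range = max(tape_totals_age9) - min(tape_totals_age9)
```

For these values the average is 3,079,128 and the four estimates span a range of 116,276 students.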

[Table numbers and text illegible in source] the spiral sample
and are included as background for the tables that follow. These tables
show the actual numbers of students for whom plausible values were computed
and population estimates of the numbers who would have had plausible values
if an educational census of the entire country were done using this NAEP
design. The design for NAEP called for some students to be assigned reading
exercises, some writing exercises, and some both.
Plausible values for reading were computed for only those students who were
assigned reading exercises that were used in the scaling process, and
[text illegible in source] plausible values. This table reinforces the fact that
plausible values for writing were computed for only those students who
were eligible in the grade samples, not the age samples.

The following tables present reading and writing proficiency estimates.

*  Table  15(15)  displays  reading  proficiency  estimates 
for  fourth  graders. 

*  Table  15(16)  displays  reading  proficiency  estimates 
for  eighth  graders. 

*  Table  15(17)  displays  reading  proficiency  estimates 
for  eleventh  graders. 


*  Table  15(18)  displays  writing  proficiency  estimates 
for  fourth  graders. 

*  Table  15(19)  displays  writing  proficiency  estimates 
for  eighth  graders. 

*  Table  15(20)  displays  writing  proficiency  estimates 
for  eleventh  graders. 

Population estimates are presented for students of specified grade
levels only (i.e., grades 4, 8, and 11). These tables also contain separate
proficiency estimates for the same selected population subgroups as in the
preceding tables but are restricted to the students in the specified
grades. In particular, breakdowns by age within grade are given. Since the
students of a given grade can be of any age, it is important to note that
some of these age estimates, which are conditional on grade placement, are
made from small samples and that none of these estimates is appropriate for
estimating the proficiency of the entire population at an age level.

For all assessable students in a grade, and for each subgroup, these
tables contain the actual sample sizes used in computing the estimates and
the sum of the weights (Weighted N) for those samples. In these tables, the
Weighted N will be of little interest, but the coefficient of variation,
which is in parentheses next to the Weighted N, is a measure of the
variability of the estimates of the standard errors and thus of importance
in judging the adequacy of the population estimates. Large coefficients of
variation (exceeding 20 percent) are emphasized with an exclamation point
(!) in the tables.

Next, for each subgroup and the total, the tables contain estimates of
mean values, standard deviations,¹ and 10th, 25th, 50th (median), 75th, and
90th percentiles. Each of these statistics is followed by an estimate of
its standard error, which is in parentheses.

Tables 15(21) through 15(69) report either estimated average values or
percents below anchor points on the reading or writing scales. These
tables, which have been selected from several proficiency almanacs, are
based on the spiral grade-eligible sample only. A detailed discussion of
how these tables were constructed and how to read and use these tables is
presented in Chapter 13.


¹These standard deviations were computed as follows: (1) for each of
five plausible values, a weighted analogue of the conventional sample
variance was computed; (2) the square root of each of these estimates was
taken; and (3) the average of the five values in (2) was obtained. The
weighted variances computed in (1) do not take into account the sample
design (see Tepping & Hansen, 1984). However, attempts to implement an
adjustment did not lead to satisfactory results (Zwick, 1985).
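The three steps in this footnote can be sketched as follows. The exact form of the "weighted analogue of the conventional sample variance" is not given in this chapter, so the sketch assumes the weighted mean squared deviation divided by the sum of the weights; the function names are ours.

```python
import math


def weighted_sd(values, weights):
    """Step (1) and (2): weighted variance (assumed form), then its square root."""
    total_weight = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total_weight
    variance = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total_weight
    return math.sqrt(variance)


def plausible_value_sd(plausible_value_sets, weights):
    """Step (3): average the standard deviations over the plausible-value sets."""
    sds = [weighted_sd(values, weights) for values in plausible_value_sets]
    return sum(sds) / len(sds)


# Illustrative input: five sets of plausible values for two students with
# equal weights. These numbers are made up for the example.
example_sets = [[1.0, 3.0]] * 5
example_sd = plausible_value_sd(example_sets, weights=[1.0, 1.0])
```

Note that the square roots are averaged, rather than the square root being taken of the averaged variances; the two orders generally give slightly different results.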




There are six sets of eight tables each:

*  Tables 15(21) through 15(28) contain reading
proficiency values for fourth graders.

*  Tables 15(29) through 15(36) contain reading
proficiency values for eighth graders.

*  Tables 15(37) through 15(44) contain reading
proficiency values for eleventh graders.

*  Tables 15(45) through 15(52) contain writing
proficiency values for fourth graders.

*  Tables 15(53) through 15(60) contain writing
proficiency values for eighth graders.

*  Tables 15(61) through 15(68) contain writing
proficiency values for eleventh graders.

These tables contain separate estimates for the different sexes,
racial/ethnic groups, levels of parental education, and ages of the
students who were in those grades. These groups are sometimes referred to
as reporting subgroups and are defined in Chapter 12. In some tables, some
age groups were so small that estimates were not reported.

Each set of tables contains:

(1)  Estimates of the average performance of the total grade and
for each reporting group.

(2)  Estimates of the average performance of the males and females
in each reporting group.

(3)  Estimates of the average performance by racial/ethnic
grouping (Whites, Blacks, Hispanics, American Indians,
Asians, and Unclassified) in each reporting group.

(4)  Estimates of the average performance of the students from
various regions of the country (Northeast, Southeast,
Central, and Western) in each reporting group.

(5)  Estimates of the average performance of students at various
age levels (conditional on grade) within each reporting
group.

(6)  Estimates of the average performance of students from
different sizes and types of communities (Rural,
Disadvantaged Urban, Advantaged Urban, Other Big City, Fringe
of Big City, Medium Cities, and Small Cities) within each
reporting group.

(7)  Estimates of the average performance of students by self-
reported parents' education (Did Not Graduate High School,
Did Graduate High School, Had Some Post-High School
Education, or Unknown) within each reporting group.

(8)  Estimates of the percents above selected anchor points for
each reporting group.

The estimates of subgroup reading proficiency reported in Tables
15(21) through 15(44) are occasionally different from the corresponding
estimates reported in Tables 15(15) through 15(17), although the difference
is always trivial. The discrepancy occurs because the estimates of
performance in Tables 15(21) through 15(44) are based on a single set of
plausible values, while the estimates in Tables 15(15) through 15(17) are
based on five sets of plausible values. While both sets of estimates have
equal expectations, the estimates based on the average of five separate
estimates are less variable. Both sets of estimates used the same
estimates of sampling variability, which include a component of
variability of estimate across sets of plausible values.

Along with each estimated proficiency value is the estimated proportion
of the population comprising that category. For example, the TOTAL line of
Table 15(22) shows, among other things, that the estimated average reading
proficiency score of male fourth-graders is 215.1 and that males are
estimated to constitute 49.8 percent of the fourth-grade population. Each
estimated average value and percent is accompanied by its standard error.

These tables contain some redundancy; for example, sex is used both as
a reporting variable and to classify the students within reporting
variables. In these cases, the logically impossible categories in the
tables are replaced by ******.




Table 15(1)

Number of Students by Grade/Age Combination
and by Type of Assessment

              Grade 4/Age 9         Grade 8/Age 13        Grade 11/Age 17
                       Sum of                 Sum of                 Sum of
            Count     Weights      Count     Weights      Count     Weights

SPIRAL      26087     3971749      28405     4363428      28861     4045041
TAPE 1       1403     3122045       1310     3310355       1539     3048025
TAPE 2       1356     3005769       1276     3348263       1540     3045283
TAPE 3       1389     3087985       1283     3338947       1596     2978520
TAPE 4       13??     3100713       1289     3340216       1534     3026687
EXCLUDED     1416      171436       1448      179054       1361      115162


Table 15(2)

Number of Spiral-Assessed Students by Grade/Age

Grade 4/Age 9

                           AGE < 9       AGE = 9       AGE > 9         TOTAL
GRADE < 4
  UNWEIGHTED N                   0          5917             0          5917
  WEIGHTED N                     0        761524             0        761524
  STANDARD ERROR                 -   [illegible]             -          800?
  COEFF. OF VAR.                 -          1.06             -          1.06

GRADE = 4
  UNWEIGHTED N                 158         12953          6984         20095
  WEIGHTED N                 21504       2295588        883198       3200290
  STANDARD ERROR              2395          9067          7089         14047
  COEFF. OF VAR.             11.14          0.39          0.80          0.44

GRADE > 4
  UNWEIGHTED N                   0            75             0            75
  WEIGHTED N                     0          9936             0          9936
  STANDARD ERROR                 -          2163             -          2163
  COEFF. OF VAR.                 -         21.77             -         21.77

GRADE TOTAL
  UNWEIGHTED N                 158         18945          6984         26087
  WEIGHTED N                 21504       3067047        883198       3971749
  STANDARD ERROR              2395         13693          7089         18935
  COEFF. OF VAR.             11.14          0.45          0.80          0.48


Table 15(2)
(continued)

Number of Spiral-Assessed Students by Grade/Age

Grade 8/Age 13

                          AGE < 13      AGE = 13      AGE > 13         TOTAL
GRADE < 8
  UNWEIGHTED N                   0          6495             0          6495
  WEIGHTED N                     0       1034711             0       1034711
  STANDARD ERROR                 -          6087             -          6087
  COEFF. OF VAR.                 -          0.59             -          0.59

GRADE = 8
  UNWEIGHTED N                 184         14515          7151         21850
  WEIGHTED N                 25662       2269841       1018446       3313949
  STANDARD ERROR              4070          4025          6860          7824
  COEFF. OF VAR.             15.86          0.18          0.67          0.24

GRADE > 8
  UNWEIGHTED N                   0            60             0            60
  WEIGHTED N                     0         14769             0         14769
  STANDARD ERROR                 -          4144             -          4144
  COEFF. OF VAR.                 -         28.06             -         28.06

GRADE TOTAL
  UNWEIGHTED N                 184         21070          7151         28405
  WEIGHTED N                 25662       3319320       1018446       4363428
  STANDARD ERROR              4070          8088          6860         11281
  COEFF. OF VAR.             15.86          0.24          0.67          0.26


Table 15(2)
(continued)

Number of Spiral-Assessed Students by Grade/Age

Grade 11/Age 17

                          AGE < 17      AGE = 17      AGE > 17         TOTAL
GRADE < 11
  UNWEIGHTED N                   0          4129             0          4129
  WEIGHTED N                     0        671683             0        671683
  STANDARD ERROR                 -         21099             -         21099
  COEFF. OF VAR.                 -          3.14             -          3.14

GRADE = 11
  UNWEIGHTED N                2386         16787          3692         22865
  WEIGHTED N                399289       2037738        635595       3072622
  STANDARD ERROR             22106          3439         21691          6816
  COEFF. OF VAR.              5.54          0.17          3.41          0.22

GRADE > 11
  UNWEIGHTED N                   0          1867             0          1867
  WEIGHTED N                     0        300736             0        300736
  STANDARD ERROR                 -         20072             -         20072
  COEFF. OF VAR.                 -          6.67             -          6.67

GRADE TOTAL
  UNWEIGHTED N                2386         22783          3692         28861
  WEIGHTED N                399289       3010157        635595       4045041
  STANDARD ERROR             22106          8221         21691         11104
  COEFF. OF VAR.              5.54          0.27          3.41          0.27


Table 15(3)

Estimated Total Number of Students
in the Population Eligible for Assessment

Grade 4/Age 9

                              ELIGIBLE    ELIGIBLE      ELIGIBLE       ELIGIBLE
                                 BY          BY            BY             BY
                                AGE        GRADE      AGE & GRADE    AGE OR GRADE

TOTAL                         3067047     3200290       2295588        3971749

SEX:
  MALE                        1507250     1594774       1073745        2028280
  FEMALE                      1559795     1605515       1221842        1943468

RACE:
  WHITE                       2173213     2260759       1664721        2769252
  BLACK                        442833      486025        317635         611224
  HISPANIC                     359832      362177        247662         474347
  OTHER                         91168       91328         65570         116926

REGION:
  NORTHEAST                    675001      714539        513833         875707
  SOUTHEAST                    727163      762456        542387         947232
  CENTRAL                      822401      851344        618875        1054869
  WEST                         842481      871951        620492        1093939

PARENTS ED:
  LESS THAN HIGH SCHOOL        165907      190429        115449         240887
  HIGH SCHOOL                  591450      641628        447221         785857
  GREATER THAN HIGH SCHOOL    1137477     1211729        906596        1442611
  UNKNOWN                     1172211     1156503        826322        1502393

SIZE AND TYPE OF COMMUNITY:
  RURAL                        197666      206921        142139         262448
  DISADVANTAGED URBAN          373042      401687        280907         493821
  ADVANTAGED URBAN             427890      454718        349044         533565
  BIG CITY                     242164      239410        179708         301866
  FRINGE                       340643      353975        257068         437549
  MEDIUM                       501314      515287        363553         653048
  SMALL                        984328     1028292        723168        1289451


Table 15(4)

Estimated Total Number of Students
in the Population Eligible for Spiral Assessment

Grade 8/Age 13

                              ELIGIBLE    ELIGIBLE      ELIGIBLE       ELIGIBLE
                                 BY          BY            BY             BY
                                AGE        GRADE      AGE & GRADE    AGE OR GRADE

TOTAL                         3319320     3313949       2269841        4363428

SEX:
  MALE                        1672496     1664184       1067848        2268831
  FEMALE                      1646719     1649513       1201887        2094346

RACE:
  WHITE                       2462505     2447689       1758484        3151710
  BLACK                        470115      483370        281652         671833
  HISPANIC                     291061      290652        161122         420591
  OTHER                         95639       92238         68583         119294

REGION:
  NORTHEAST                    760547      757155        522568         995134
  SOUTHEAST                    766165      776671     [illegible]      1019378
  CENTRAL                      890588      878077        611431        1157234
  WEST                         902019      902046        612383        1191681

PARENTS ED:
  LESS THAN HIGH SCHOOL        286535      312820        164915         434441
  HIGH SCHOOL                 1164547     1170291        792415        1542423
  GREATER THAN HIGH SCHOOL    1512740     1512803       1125597        1899946
  UNKNOWN                      355497      318034        186913         486618

SIZE AND TYPE OF COMMUNITY:
  RURAL                        174221      176480        117124         233576
  DISADVANTAGED URBAN          297675      294289        182362         409602
  ADVANTAGED URBAN             356948      352477        275203         434222
  BIG CITY                     359003      346473        244980         460496
  FRINGE                       543237      549813        381908         711142
  MEDIUM                       484395      499186        323148         660433
  SMALL                       1103841     1095230        745115        1453957


Table 15(5)

Estimated Total Number of Students
in the Population Eligible for Spiral Assessment

Grade 11/Age 17

                              ELIGIBLE    ELIGIBLE      ELIGIBLE       ELIGIBLE
                                 BY          BY            BY             BY
                                AGE        GRADE      AGE & GRADE    AGE OR GRADE

TOTAL                         3010157     3072622       2037738        4045041

SEX:
  MALE                        1532749     1548465        979337        2101878
  FEMALE                      1477134     1524155       1058401        1942889

RACE:
  WHITE                       2259880     2284441       1602266        2942055
  BLACK                        426035      458544        248863         635716
  HISPANIC                     242188      245739        135384         352543
  OTHER                         82053       83897         51224         114726

REGION:
  NORTHEAST                    734255      754812        482051        1007016
  SOUTHEAST                    669005      675109        418403         925711
  CENTRAL                      816774      840587        597192        1060168
  WEST                         790123      802113        540092        1052144

PARENTS ED:
  LESS THAN HIGH SCHOOL        351212      350297        192273         509236
  HIGH SCHOOL                 1041252     1031742        688507        1384486
  GREATER THAN HIGH SCHOOL    1491560     1567083       1090224        1969219
  UNKNOWN                      126133      122699         66733         182098

SIZE AND TYPE OF COMMUNITY:
  RURAL                        153730      166812        107855         212686
  DISADVANTAGED URBAN          308175      321553        163095         466633
  ADVANTAGED URBAN             482440      506711        345115         644037
  BIG CITY                     266997      270523        169867         367652
  FRINGE                       327360      316221        225178         418403
  MEDIUM                       493442      513958        355848         651551
  SMALL                        978013      976844        670779        1284078


Table 15(6)

Estimated Total Number of Students
in the Population Eligible for Spiral Assessment
Who Would Be Deemed Unassessable by Their Schools

Grade 4/Age 9

                              ELIGIBLE    ELIGIBLE      ELIGIBLE       ELIGIBLE
                                 BY          BY            BY             BY
                                AGE        GRADE      AGE & GRADE    AGE OR GRADE

TOTAL                          107871      105233         41668         171436

SEX:
  MALE                          68743       66881         25622         110002
  FEMALE                        38971       37895         15889          60977

RACE:
  WHITE                         50802       47171         17982          79991
  BLACK                         13227       14178          5343          22062
  HISPANIC                      32288       32882         13966          51204
  OTHER                         11554       11002          4377          18178

REGION:
  NORTHEAST                     23083       21263         10793          33553
  SOUTHEAST                     24597       20449          7366          37680
  CENTRAL                       21528       22964          8042          36451
  WEST                          38663       40557         15468          63752

SIZE AND TYPE OF COMMUNITY:
  RURAL                          4784        8270          1301          11752
  DISADVANTAGED URBAN           21558       22815          8568          35805
  ADVANTAGED URBAN              11483       10004          6102          15385
  BIG CITY                       8818        5538          3531          10826
  FRINGE                        15320       16117          8775          22662
  MEDIUM                        18129       16158          4383          29905
  SMALL                         27778       26331          9009          45100


Table 15(7)

Estimated Total Number of Students
in the Population Eligible for Spiral Assessment
Who Would Be Deemed Unassessable by Their Schools

Grade 8/Age 13

                              ELIGIBLE    ELIGIBLE      ELIGIBLE       ELIGIBLE
                                 BY          BY            BY             BY
                                AGE        GRADE      AGE & GRADE    AGE OR GRADE

TOTAL                          101041      116028         38014         179054

SEX:
  MALE                          65191       72953         23634         114510
  FEMALE                        35202       42817         14310          63709

RACE:
  WHITE                         52073       64039         20360          95752
  BLACK                         19199       20180          6138          33241
  HISPANIC                      18922       20775          7312          32385
  OTHER                         10847       11034          4206          17676

REGION:
  NORTHEAST                     20033       22252          8419          33866
  SOUTHEAST                     21251       27801          6604          42449
  CENTRAL                       31571       37884         12239          57216
  WEST                          28185       28091         10752          45523

SIZE AND TYPE OF COMMUNITY:
  RURAL                          3273        6830          1448           8655
  DISADVANTAGED URBAN           18243       20111          8613          29742
  ADVANTAGED URBAN               6287        6721          1686          11321
  BIG CITY                      13994       14529          5683          22840
  FRINGE                        12514       11733          5026          19221
  MEDIUM                        14764       19358          3394          30728
  SMALL                         31966       36746         12165          56548


Table 15(8)

Estimated Total Number of Students
in the Population Eligible for Spiral Assessment
Who Would Be Deemed Unassessable by Their Schools

Grade 11/Age 17

                              ELIGIBLE    ELIGIBLE      ELIGIBLE       ELIGIBLE
                                 BY          BY            BY             BY
                                AGE        GRADE      AGE & GRADE    AGE OR GRADE

TOTAL                           74451       65829         25119         115162

SEX:
  MALE                          47857       41745         16471          73131
  FEMALE                        26513       23983          8647          41849

RACE:
  WHITE                         30787       29275         11238          48824
  BLACK                         15712       11073          3752          23034
  HISPANIC                      17957       15626          7159          26424
  OTHER                          9996        9855          2970          16881

REGION:
  NORTHEAST                     11246       10482          3689          18039
  SOUTHEAST                     18700       14775          5540          27935
  CENTRAL                       19661       19786          6157          33290
  WEST                          24844       20787          9732          35899

SIZE AND TYPE OF COMMUNITY:
  RURAL                          3158        3107          1201           5064
  DISADVANTAGED URBAN           18443       13928          6433          25938
  ADVANTAGED URBAN               5138        5526          2267           8396
  BIG CITY                       8434        6976          2521          12889
  FRINGE                         5969        6398          2246          10121
  MEDIUM                        13388       12010          4096          21302
  SMALL                         19921       17885          6354          31452


Table  15(9) 


Estimated  Total  Number  of  Students 
Who  Are  Eligible  for  Assessment  by  Tape  Sample 


Age  9 


TAPP  1 

TAPE  J 

TAPE  4 

TOTAL 

3122045 

3005769 

3087985 

3100713 

SEX: 

MALE 
FEMALE 

1580474 

K'^8825 

1  ^  /.  "I  D  0  O 

1540155 

1502897 
1597815 

RACE: 

WHITE 
BLACK 
HISPANIC 
OTHER 

2232047 
446612 
331819 
111566 

2156800 
434192 

"^1 1099 

103755 

2201732 
454103 

108873 

2214269 
451488 

0  9  0  /,  T  0 

/Z 

111484 

REGION:
  NORTHEAST                        672037       660320       678638       648949
  SOUTHEAST                        778707       674785       774924       871961
  CENTRAL                          787159       900433  [illegible]       768386
  WEST                             884142       770231  [illegible]       811416

PARENTS ED:
  LESS THAN HIGH SCHOOL            186493       145728       178006       229364
  HIGH SCHOOL                      541633       613470       607155       645405
  GREATER THAN HIGH SCHOOL        1279406      1090678      1179219      1097028
  UNKNOWN                         1114511      1155892      1123604      1128914

SIZE AND TYPE OF COMMUNITY:
  RURAL                            319513       241335       237176       115983
  DISADVANTAGED URBAN              317826       388177       508894       447247
  ADVANTAGED URBAN                 532838       464262       423289       250761
  BIG CITY                         270103       117433       366492       319272
  FRINGE                           266304       355329        85568       243818
  MEDIUM                           603469       241238       412820       236180
  SMALL                            811991      1197994      1053744      1487452



Table 15(10)

Estimated Total Number of Students
Who Are Eligible for Assessment by Tape Sample

Age 13

                                   TAPE 1       TAPE 2       TAPE 3       TAPE 4

TOTAL                             3310355      3348263      3338947      3340216

SEX:
  MALE                            1747611      1677070      1629646      1785519
  FEMALE                          1562742      1669346      1709301      1554697

RACE:
  WHITE                           2470449      2501335      2476403      2478442
  BLACK                            459418       469832       480222       467216
  HISPANIC                         289113       282776       292047       299025
  OTHER                             91374        94320        90270        95533

REGION:
  NORTHEAST                        761536       690015       731394       733441
  SOUTHEAST                        870751       955525       804766       870643
  CENTRAL                          856550       841877       794101       835296
  WEST                             821518       860846      1008687       900836

PARENTS ED:
  LESS THAN HIGH SCHOOL            307983       210519       330924       284487
  HIGH SCHOOL                     1226582      1161261      1283853      1257443
  GREATER THAN HIGH SCHOOL        1494458      1575838      1431932      1489634
  UNKNOWN                          281330       400644       292237       308651

SIZE AND TYPE OF COMMUNITY:
  RURAL                            257086       166458       145597       398820
  DISADVANTAGED URBAN              352010       178032       361847       165746
  ADVANTAGED URBAN                 390384       201802       301008       316831
  BIG CITY                         294904       368039       211471       344823
  FRINGE                           588801       659332       721578       578886
  MEDIUM                           470814       648954       481058       670936
  SMALL                            956356      1125646      1116388       864175



Table 15(11)

Estimated Total Number of Students
Who Are Eligible for Assessment by Tape Sample

Age 17

                                   TAPE 1       TAPE 2       TAPE 3       TAPE 4

TOTAL                             3048025      3045283      2978520      3026687

SEX:
  MALE                            1618433      1583931      1519193      1485595
  FEMALE                          1429591      1461350      1459326      1541091

RACE:
  WHITE                           2286809      2277417      2252712      2261915
  BLACK                            419084       434916       425528       441416
  HISPANIC                         255867       248980       212199       247841
  OTHER                             86264        83970        88082        75514

REGION:
  NORTHEAST                        739136       717514       647244       767227
  SOUTHEAST                        733342       823804       780987       704032
  CENTRAL                          784021       723271       758629       823867
  WEST                             791526       780695       791660       731561

PARENTS ED:
  LESS THAN HIGH SCHOOL            359225       361064       360424       313972
  HIGH SCHOOL                     1072167      1069930      1055548      1075779
  GREATER THAN HIGH SCHOOL        1500148      1450475      1476708      1543271
  UNKNOWN                          116484       163813        85840        93664

SIZE AND TYPE OF COMMUNITY:
  RURAL                            184826        75975       101252       217805
  DISADVANTAGED URBAN              356393       299455       254331       301164
  ADVANTAGED URBAN                 534485       509189       296041       528053
  BIG CITY                         133848       225307       402884       215027
  FRINGE                           334566       223363       364522       287883
  MEDIUM                           582749       504779       510419       524017
  SMALL                            921158      1207215      1049070       952738



Table 15(12)a

Number of Students Receiving Reading and Writing
Items and Plausible Values

Grade 4/Age 9

                                   ELIGIBLE       ELIGIBLE    ELIGIBLE BY    ELIGIBLE BY
                                     BY AGE       BY GRADE  AGE AND GRADE   AGE OR GRADE

TOTAL                                 18945          20095          12953          26087

STUDENTS WITH READING:
  ITEMS                               18497          19637          12660          25474
  PLAUSIBLE VALUES                    16799          17840          11507          23132

STUDENTS WITH WRITING:
  ITEMS                               16025          16987          10986          22026
  PLAUSIBLE VALUES                     5795           8807           5795           8807




Table 15(12)b

Weighted Counts of Students Receiving
Reading and Writing Items and Plausible Values

Grade 4/Age 9

                                   ELIGIBLE       ELIGIBLE    ELIGIBLE BY    ELIGIBLE BY
                                     BY AGE       BY GRADE  AGE AND GRADE   AGE OR GRADE

TOTAL                               3067047        3200200        2295588        3971749

STUDENTS WITH READING:
  ITEMS                             2991059        3125304        2240814        3875548
  PLAUSIBLE VALUES                  2710595        2837712        2031707        3516600

STUDENTS WITH WRITING:
  ITEMS                             2600057        2709078        1951487        3357648
  PLAUSIBLE VALUES                  1026813        1408047        1026813        1408047




Table 15(13)a

Number of Students Receiving Reading and Writing
Items and Plausible Values

Grade 8/Age 13

                                   ELIGIBLE       ELIGIBLE    ELIGIBLE BY    ELIGIBLE BY
                                     BY AGE       BY GRADE  AGE AND GRADE   AGE OR GRADE

TOTAL                                 21070          21850          14515          28405

STUDENTS WITH READING:
  ITEMS                               20568          21324          14173          27719
  PLAUSIBLE VALUES                    17535          18173          12043          23665

STUDENTS WITH WRITING:
  ITEMS                               17810          18498          12289          24019
  PLAUSIBLE VALUES                     7420          11092           7420          11092




Table 15(13)b

Weighted Counts of Students Receiving
Reading and Writing Items and Plausible Values

Grade 8/Age 13

                                   ELIGIBLE       ELIGIBLE    ELIGIBLE BY    ELIGIBLE BY
                                     BY AGE       BY GRADE  AGE AND GRADE   AGE OR GRADE

TOTAL                               3319320        3313949        2269841        4363428

STUDENTS WITH READING:
  ITEMS                             3241767        3235027        2217367        4259426
  PLAUSIBLE VALUES                  2763807        2761459        1888098        3637168

STUDENTS WITH WRITING:
  ITEMS                             2805295        2802901        1918064        3690132
  PLAUSIBLE VALUES                  1155803        1682192        1155803        1682192




Table 15(14)a

Number of Students Receiving Reading and Writing
Items and Plausible Values

Grade 11/Age 17

                                   ELIGIBLE       ELIGIBLE    ELIGIBLE BY    ELIGIBLE BY
                                     BY AGE       BY GRADE  AGE AND GRADE   AGE OR GRADE

TOTAL                                 22783          22865          16787          28861

STUDENTS WITH READING:
  ITEMS                               22226          22325          16381          28170
  PLAUSIBLE VALUES                    18984          19080          14009          24055

STUDENTS WITH WRITING:
  ITEMS                               19267          19367          14219          24415
  PLAUSIBLE VALUES                     7919          10657           7919          10657




Table 15(14)b

Weighted Counts of Students Receiving
Reading and Writing Items and Plausible Values

Grade 11/Age 17

                                   ELIGIBLE       ELIGIBLE    ELIGIBLE BY    ELIGIBLE BY
                                     BY AGE       BY GRADE  AGE AND GRADE   AGE OR GRADE

TOTAL                               3010157        3072622        2037738        4045040

STUDENTS WITH READING:
  ITEMS                             2936218        2999803        1987958        3948063
  PLAUSIBLE VALUES                  2509020        2563822        1699683        3373158

STUDENTS WITH WRITING:
  ITEMS                             2545582        2600027        1724920        3420689
  PLAUSIBLE VALUES                   963071        1430241         963071        1430241




Table 15(15)

NAEP 1983-84 READING AND WRITING ASSESSMENT - 4TH GRADERS
WEIGHTED MEANS, STANDARD DEVIATION (N-1), AND PERCENTILES FOR REPORTING GROUPS

GENERAL READING PROFICIENCY (AVERAGE OF 5 PLAUSIBLE VALUES)

                                  N   WEIGHTED N         MEAN

- TOTAL -                     17840      2837712 ( 0%)   217.4 (0.7)

SEX
  MALE                         9063      1412664 ( 1%)   [illegible] (0.9)
  FEMALE                       8777      1425047 ( 1%)   220.0 (0.7)

ETHNICITY/RACE
  WHITE                       11782      2002444 ( 1%)   224.8 (0.9)
  BLACK                        2796       431187 ( 1%)   195.0 (1.3)
  HISPANIC                     2469       322624 ( 2%)   200.7 (1.0)
  OTHER                         793        81456 ( 4%)   221.1 (1.8)

REGION
  NORTHEAST                    4061       633342 ( ?%)   220.8 (1.6)
  SOUTHEAST                    4520       673333 ( 5%)   212.6 (1.6)
  CENTRAL                      4940       757011 ( 5%)   221.2 (1.8)
  WEST                         4319       774026 ( ?%)   215.0 (1.3)

PARENTAL EDUCATION
  NOT GRADUATED H.S.           1126       166134 ( 5%)   200.0 (1.2)
  GRADUATED H.S.               3650       570097 ( 3%)   215.4 (0.8)
  POST H.S.                    6634      1075734 ( 3%)   227.4 (1.1)
  UNKNOWN                      6272      1001015 ( 2%)   211.3 (0.8)

SIZE/TYPE OF COMMUNITY
  RURAL                        1156       183645 (17%)   208.7 (2.5)
  DISADVANTAGED URBAN   [illegible]       359699 (16%)   197.9 (1.5)
  ADVANTAGED URBAN             2055       399737 (15%)   234.7 (2.3)
  BIG CITIES            [illegible]       214344 (20%)   214.1 (2.6)
  FRINGE OF BIG CITIES  [illegible]       313591 (14%)   219.3 (1.6)
  MEDIUM CITIES                2755  [illegible] ( 9%)   218.7 (2.3)
  SMALL PLACES          [illegible]       909162 ( 5%)   218.7 (0.9)

AGE
  8 OR YOUNGER                  130        17668 (11%)   226.3 (3.8)
  9 YEARS OLD                 11507      2031707 ( 0%)   221.8 (0.8)
  10 OR OLDER                  6203       788337 ( 1%)   206.0 (0.9)

! INTERPRET WITH CAUTION. STANDARD ERRORS ARE POORLY ESTIMATED.


ST.  DEV. 
38. 0(  0.4) 


39. 3(  0.5) 
36. 6(  0.5) 


36. 1(  0.4) 

35. 3(  0.8) 

36. 9(  0.7) 

37. 0(  1.5) 


37. 1(  l.D) 

33. D(  0.6) 

37. 7(  0.8) 

38. 5(  0.9) 


36. 2(  1.1) 

36. 4(  0.6) 

38. 2(  0.5) 

36. 3(  0.5) 


38. 6(  0.8) 

37. 5(  1.4) 

36. 6(  1.1) 

36. 7(  0.9) 

35. 6(  0.7) 

36. 2(  0.7) 

36. 7(  0.5) 


38. 8(  2.7) 
36. 8(  0.4) 
30.9(15.5) 


-  10  - 
168. 4(  0.9) 


164. 3(  0.9) 
173. 3(  1.3) 


178. 3(  1.4) 

150. 1(  3.2) 

153. 3(  1.6) 

172. 6(  2.8) 


172. 9(  2.7) 

163. 9(  1.0) 

172. 4(  3.3) 

165. 4(  1.9) 


153. 8(  2.6) 

168. 3(  1.7) 

177. 1(  1.5) 

164. 5(  1.9) 


158. 3(  3.3) 

149. 5(  2.7) 

188. 2(  2.5) 

166. 2(  2.5) 

173. 4(  2.5) 

171. 4(  2.9) 

171. 1(  1.5) 


174. 5(  5.8) 
174. 7(  1.3) 
156. 0(  1.3) 


-  25  -
191. 9(  0.7) 


168. 1(  0.9) 
195. 8(  0.8) 


200. 8(  0.9) 

171. 3(  1.5) 

176. 2(  1.6) 

196. 5(  3.3) 


1<J5.9(  1.9) 

187. 3(  1.7) 

195. 7(  2.0) 

189. 3(  2.0) 


176. 1(  2.0) 

191. 0(  1.1) 

202. 0(  1.0) 

IH7.U  0.8) 


162. 0(  3.0) 

172. 7(  1.4) 

210. 3(  3.2) 

169. 1(  3.8) 

195. 3(  1.7) 

19^4. 5(  2.4) 

1^3. 9(  1.3) 


1<J9.9(  6.1) 
1^7. 3(  1.2) 
179. 7(  1.2) 


-  50  - 
218. 0(  0.8) 


215. 2(  1.2) 
220. 4(  0.8) 


225. 3(  0.9) 

195. 6(  1.4) 

201. 3(  1.4) 

222. 0(  1.4) 


221. 2(  2.0) 

212. 9(  2.0) 

222. 1(  1.6) 

215. 4(  1.7) 


200. 7(  1.1) 

216. 5(  1.1) 

228. 6(  1.3) 

211. 6(  1.0) 


209. 0(  2.5) 

197. 5C  1.3) 

234. 8(  2.6) 

214. 4(  3.1) 

220. 2(  2.5) 

219. 1(  2.7) 

219. 0(  1.0) 


229. 6(  4.7) 
222. 2(  0.8) 
205. 6(  0.8) 


-  75  - 
243. 7(  1.0) 


242. 4(  i.3) 
244. 8(  1.1) 


249. 4(  1.1) 

218. 4(  1.4) 

226. 3(  1.3) 

246. 2(  3.1) 


246. 5(  2.1) 

238. 5(  1.6) 

247. 2(  2.1) 

241. 7(  1.5) 


224. 3(  2.0) 

240. 1(  1.0) 

254. 0(  1.6) 

236. 1(  0.9) 


235. 6(  3.1) 

223. 7(  2.2) 

259. 8(  1.4) 

239. 9(  2.8) 

243. 6(  2.2) 

243. 7(  2.6) 

243. 6(  1.1) 


254.7(15.1) 
247. 0(  l.Z) 
232. 4(  1.1) 


-  90  - 
266. 0(  1.1) 


265. 4(  1.8) 
266. 7(  1.2) 


271. 1(  1.0) 

239. 3(  1.9) 

247. 8(  1.6) 

266.6(  2.0) 


266. 7(  2.3) 

260. 9(  2.0) 

269. 8(  2.3) 

263. 5(  2.2) 


245. 1(  1.3) 

260. 9(  0.8) 

275. 4(  1.0) 

257. 8(  1.2) 


257. 6(  2.3) 

246. 6(  3.7) 

261. 2(  2.1) 

260. 3(  2.1) 

263. 8(  2.6) 

264. 6(  3.5) 

266. 0(  1.4). 


272.2(10.2) 
269. 2(  1.2) 
256. <t(  1.4) 




Table 15(16)

NAEP 1983-84 READING AND WRITING ASSESSMENT - 8TH GRADERS
WEIGHTED MEANS, STANDARD DEVIATION (N-1), AND PERCENTILES FOR REPORTING GROUPS

GENERAL READING PROFICIENCY (AVERAGE OF 5 PLAUSIBLE VALUES)

                                  N   WEIGHTED N         MEAN

- TOTAL -                     18173      2761459 ( 0%)   260.6 (0.5)

SEX
  MALE                         9066      1380877 ( ?%)   257.0 (0.6)
  FEMALE                       9106  [illegible] ( 1%)   264.1 (0.6)

ETHNICITY/RACE
  WHITE                       12939      2044478 ( 0%)   266.5 (0.6)
  BLACK                        2555       398198 ( 1%)   240.1 (1.1)
  HISPANIC                     2043       241189 ( 2%)   242.9 (1.3)
  OTHER                         636        77593 ( 3%)   264.3 (1.6)

REGION
  NORTHEAST                    4109       627702 ( 2%)   262.4 (1.0)
  SOUTHEAST                    4589       648087 ( 6%)   259.8 (1.4)
  CENTRAL                      5061       726560 ( 5%)   262.1 (1.2)
  WEST                         4414       759110 ( 2%)   258.3 (0.7)

PARENTAL EDUCATION
  NOT GRADUATED H.S.           1833       262530 ( 5%)   244.4 (0.7)
  GRADUATED H.S.               6444       974445 ( 3%)   255.6 (0.7)
  POST H.S.                    8117      1261623 ( 2%)   271.4 (0.7)
  UNKNOWN                      1609       234214 ( 5%)   241.5 (1.1)

SIZE/TYPE OF COMMUNITY
  RURAL                        1082       146720 (21%)!  259.7 (2.3)
  DISADVANTAGED URBAN          1812       244034 (17%)   241.9 (1.9)
  ADVANTAGED URBAN             1977       292740 (22%)!  276.3 (2.6)
  BIG CITIES                   1839       286912 (32%)!  256.7 (1.7)
  FRINGE OF BIG CITIES         2475       456315 (18%)   261.6 (1.2)
  MEDIUM CITIES                2504       420128 (18%)   259.9 (2.5)
  SMALL PLACES                 6484       914603 ( 7%)   261.0 (0.9)

AGE
  12 OR YOUNGER                 154        21346 (16%)   265.5 (4.3)
  13 YEARS OLD                12043      1888098 ( 0%)   266.2 (0.6)
  14 OR OLDER                  5976       852015 ( 1%)   247.9 (0.8)

! INTERPRET WITH CAUTION. STANDARD ERRORS ARE POORLY ESTIMATED.


ST.  DEV. 
34. 9(  0.3) 


35. 2(  0.4) 
34. 3(  0.4) 


33. 2(  0.4) 

33. 0(  0.6) 

33. d(  0.8) 

34. 1(  1.0) 


34. 8(  0.6) 

36. 2(  O.d) 

34. 0(  0.4) 

34. 7(  0.6) 


33. 1(  0.8) 

32. 9(  0.4) 

33. 2(  0.4) 

34. 3(  0.6) 


33. 3(  0.8) 

33. 9(  1.4) 

32. 2(  0.7) 

33. 2(  1.1) 

34. 5(  0.7) 

35. 5(  0.9) 

34. 3(  0.4) 


35. 4(  2.1) 
33. 2(  0.4) 
28.3(14.2) 


-  10  - 
215. 6(  0.7) 


211. 5(  1.1) 
219. 9(  1.0) 


224. 3(  1.4) 

197. 4(  1.7) 

197. 7(  2.1) 

220. 1(  3.1) 


217. 7(  1.0) 

213. 4(  2.0) 

217. 9(  2.0) 

213. 4(  1.9) 


201. 9(  0.9) 

213. 5(  1.3) 

228. 7(  1.0) 

197. 3(  1.5) 


216. 0(  3.2) 

198. 7(  3.6) 

236. 1(  3.7) 

216. 2(  2.3) 

216. 7(  1.3) 

213. 6(  3.0J 

216. 7(  1.6) 


216. 6(  6.7) 
223. 9(  1.2) 
202. 4(  0.8) 


-  25  - 
238. 1(  0.6) 


234. 0(  0.8) 
241. 4(  0.8) 


2^4. 6(  0.8) 

218. 4(  1,0) 

2c0.3(  2.5) 

240. 9(  1.9) 


240. 0(  1.3) 

235. 6(  1.9) 

239. 9(  1.3) 

235. 7(  1.1) 


2c2.4(  1.0) 

234. 2(  1.4) 

249. 9(  1.1) 

219. 1(  1.4) 


237. 9(  3.0) 

219. 7(  2.4) 

2E4.9(  2.5) 

237. 7(  2.4) 

239. 0(  1.5) 

236. 6(  3.7) 

238. 7(  1.1) 


243. 1(  5.3) 
244. 5(  0.9) 
224. 5(  1.0) 


-  50  - 
261. 3(  0.6) 


258. 1(  0.8) 
264. 5(  0.8) 


267. 1(  0.9) 

241. 2(  1.3) 

244. 1(  1.4) 

264. 6(  3.2) 


263. 5(  1.5) 

260. 5(  1.7) 

262. 5(  1.4) 

259. 0(  0.9) 


245. 2(  1.4) 

256. 2(  0.7) 

272. 3(  1.1) 

242. 0(  1.3) 


260. 6(  2.9) 

242. 8(  1.9) 

277. OC  2.3) 

259. 3(  1.4) 

262. 3(  1.2) 

261. 1(  2.5) 

261. 7(  1.0) 


267. 9(  4.0) 
266. 7(  0.8) 
248. 0(  0.8) 


-  75  - 
264. 4(  0.6) 


281. 4(  0.9) 
287. 1(  0.6) 


288. 8(  0.8) 

262. 2(  1.5) 

265. 6(  1.4) 

288. 0(  1.9) 


285. 8(  0.7i 

284. 7(  2.2) 

2^5. 3(  1.1) 

281. 8(  1.0) 


266. 6(  1.2) 

277. 6(  0.7) 

294. 0(  0.8) 

263. 7(  1.9) 


282. 6(  3.7) 

263. 9(  2.5) 

297. 9(  3.2) 

280. 3(  3.3) 

285. 1(  0.8) 

284. 5(  1.9) 

284. 2(  0.9) 


289. 3(  7.9) 
208. 4(  0.8) 
272. 0(  1.0) 


-  90  - 
304. 9(  0.8) 


300. 5(  1.2) 
308. 2(  0.7) 


308. 9(  0.9) 

281. 5(  1.2) 

204. 9(  1.4) 

307. 4(  4.6) 


306. 1(  0.9) 

306. 1(  2.1) 

305. 4(  2.2) 

302. 3(  1.3) 


286. 1(  1.5) 

297. 2(  0.7) 

312. 6(  1.0) 

284. 7(  1.3) 


301. 6(  2.0) 

284. 6(  2.6) 

317. 4(  2.3) 

300. 1(  2.7) 

304. 9(  1.8) 

304. 2(  3.1) 

304. 9(  1.1) 


309. 0(  5.7) 
308. 6(  0.8) 
293. 2(  1.2) 




Table 15(17)

NAEP 1983-84 READING AND WRITING ASSESSMENT - 11TH GRADERS
WEIGHTED MEANS, STANDARD DEVIATION (N-1), AND PERCENTILES FOR REPORTING GROUPS

GENERAL READING PROFICIENCY (AVERAGE OF 5 PLAUSIBLE VALUES)


- TOTAL -

SEX
MALE
FEMALE

ETHNICITY/RACE
WHITE
BLACK
HISPANIC
OTHER

REGION
NORTHEAST
SOUTHEAST
CENTRAL
WEST

PARENTAL EDUCATION
NOT GRADUATED H.S.
GRADUATED H.S.
POST H.S.
UNKNOWN

SIZE/TYPE OF COMMUNITY
RURAL
DISADVANTAGED URBAN
ADVANTAGED URBAN
BIG CITIES
FRINGE OF BIG CITIES
MEDIUM CITIES
SMALL PLACES

AGE
16 OR YOUNGER
17 YEARS OLD
18 OR OLDER


N

19080


9443
9637


13914 
2792 
1699 
675 


4318 
4873 
5303 
4586 


2300 
6600 
9378 
596 


1217 
1958 
2546 
1702 
1848 
3210 
6519 


1992 
14009 
3079 


WEIGHTED  N 
2563822(  0^> 


1292364(  2^1 
1271457(  2X) 


190^547(  07.} 

383493(  17.) 

203453(  Z7.} 

70329(  17.) 


628410 (  Z7.} 

564382 (  Q7.) 

702531  (  67.) 

668498  (  27.) 


293458(  57.} 

865215(  17.) 

1301603(  Z7.) 

78893(  57.} 


137029(22/^)! 

272210(21X1! 

420168(16X) 

224698(24X1! 

262515(26X)! 

429e62(  97.) 

817320(  57.) 


334011(  67.) 
1699603(  OX) 
530128(  3X) 


MEAN 
289. 1(  0.8) 


284. 1(  1.0) 
294. 2(  0.9) 


295. 8(  0.9) 

266. 6(  1.8) 

269. 2(  2.0) 

267. 2(  2.2) 


290. 1(  2.7) 

287. 1(  1.7) 

290. 7(  1.7) 

288. 3(  0.8) 


269. 5(  1.2) 

281. 3(  0.7) 

300. 5(  0.9) 

260. 3(  2.1) 


284. 6( 
266. 7( 
300. 6( 
290. 1( 
289. 9( 
292. 4( 
289. 2( 


3.2) 
2.5) 
3.0) 
2.4) 
1.3) 
1.2) 
1.0) 


ST.  DEV. 
38. 9(  0.3) 


39. 3(  0.4) 
37. 8(  0.4) 


37. 0(  0.3) 

36. 1(  0.6) 

38. 2(  0.7) 

40. 9(  1.4) 


39. 6(  0.6) 

39. 9(  0.6) 

37. 7(  0.8) 

38. 6(  0.7) 


36. 9(  0.6) 

36. 9(  0.6) 

36. 7(  0.4) 

37. 3(  1.5) 


38. 2^ 
37. 8( 
38. 9( 
37. 0( 
36. 6( 
36. 2( 
37. 9( 


1.0) 
0.9) 
1.0) 
1.2) 
0.6) 
0.8) 
0.4) 


-  10  - 
239. 1(  1.0) 


232. 3(  1.7) 
245. 1(  1.6) 


248. 7(  1.2) 

219. 8(  3.1) 

213. 9(  1.6) 

233. 2(  6.3) 


236. 6(  2.1) 

23i*.9(  2.0) 

241. 5(  2.8) 

236. 6(  1.3) 


221. 4(  2.2) 

233. 2(  1.7) 

253. 3(  1.2) 

213. "J  6.0) 


3.2) 
3.7) 


235. 6( 
217. 8i 

250. 4(  6.0) 

242. 4(  6.1) 

242. 6(  2.4) 
242. 3( 
239. 8( 


1.3) 
1.2) 


-  25  - 
264. 1(  0.9) 


216. 5(  1.1) 
269. 1(  1.0) 


271. 9(  0.9) 

2^,2. 4(  1.5) 

243. 5(  3.6) 

^61.1(  4.0) 


264. 2(  3.1) 

2t0.4(  2.0) 

266. 5(  1.8) 

263. 5(  1.2) 


2^4. 2(  1.6) 
£57. 3(  0.8) 
276. 8(  0.8) 
2156. 1(  4.1) 


257. 9( 
241. 6( 
276. 0( 
266. 9( 
265. 6( 
267. 3( 
264. 9( 


3.7) 
2.1) 
3.4) 
1.8) 
1.9) 
1.6) 
1.0) 


-  50  - 
290. 4(  0.7) 


285. 5(  0.9) 
295. 1(  1.0) 


296. 7(  1.0) 

266. 7(  2.6) 

270. 3(  2.6) 

206. 5(  2.2) 


291. 3(  2.5) 

200. 2(  2.0) 

291. 9(  1.4) 

289. 2(  1.3) 


269. 9(  1.4) 

282. 4(  0.9) 

301. 4(  1.0) 

260. 4(  2.9) 


205. 4( 
266. 9( 
302. 1( 
291. 3( 
290. 1( 
293. 5( 
290. 4( 


3.3) 
2.6) 
3.2) 
2.0) 
0.9) 
1.5) 
0.7) 


299. 8(  1.4) 
294. 8(  0.7) 
264. 1(  1.3) 


36. 2(  0.8) 
36. 5(  0.4) 
30.X(15.1) 


253. 3(  2.5) 
248. 0(  1.4) 
215. 9(  1.7) 


275. 7(  1.1) 
271. 0(  1.0) 
238. 9(  1.2) 


300. 6(  2.3) 
295. 6(  0.6) 
264. 5(  1.0) 


-  75  - 
315. 9(  0.9) 


311. 0(  1.2) 
320. 1(  1.1) 


321. 0(  0.8) 

291. 4(  2.5) 

294. 9(  2.3) 

315. 3(  2.3) 


317. 7(  2.9) 

314. 7(  1.7) 

316. 5(  1.6) 

314. 4(  1.5) 


295. 5(  -..4) 

306. 3(  1.0) 

325. 3(  0.8) 

285. 5(  3.5) 


311. 0(  3.9) 

292. 4(  2.8) 

327. 5(  2.5) 

315. 2(  2.0) 

315. 0(  1.4) 

316. 4(  1.6 

314. 9(  1.1) 


323. 8(  1.6) 
319. 9(  0.7) 
289. 7(  1.7) 


-  90  - 
337. 8(  0.9) 


333. 9(  1.1) 
342. 2(  1.4) 


342. 4(  1.2) 

312. 6(  2.5) 

317. 9(  2.4) 

338. 2(  3.7) 


339. 8(  2.7) 

337. 8(  1.7) 

337. 4(  1.9) 

336. 8(  1.1) 


317. 1(  1.1) 

327. 3(  1.0) 

346. 6(  1.1) 

307. 7(  4.9) 


333. 0( 
314. 4( 
346. 8( 
3Z6.0( 
335. 0( 
340. 6( 
336. 6( 


2.4) 
3.5) 
2.4) 
2.9) 
1.6) 
2.1) 
1.3) 


346. 2(  2.5) 
340. 9(  0.8) 
311. 5(  1.3) 


! INTERPRET WITH CAUTION. STANDARD ERRORS ARE POORLY ESTIMATED.


Table 15(18)

NAEP 1983-84 READING AND WRITING ASSESSMENT - 4TH GRADERS
WEIGHTED MEANS, STANDARD DEVIATION (N-1), AND PERCENTILES FOR REPORTING GROUPS

A.R.M. WRITING PROFICIENCY (AVERAGE OF 5 PLAUSIBLE VALUES)


- TOTAL -

SEX
MALE
FEMALE

ETHNICITY/RACE
WHITE
BLACK
HISPANIC
OTHER

REGION
NORTHEAST
SOUTHEAST
CENTRAL
WEST

PARENTAL EDUCATION
NOT GRADUATED H.S.
GRADUATED H.S.
POST H.S.
UNKNOWN

SIZE/TYPE OF COMMUNITY
RURAL
DISADVANTAGED URBAN
ADVANTAGED URBAN
BIG CITIES
FRINGE OF BIG CITIES
MEDIUM CITIES
SMALL PLACES

AGE
8 OR YOUNGER
9 YEARS OLD
10 OR OLDER


N
8807


4410 
4397 


5931 
1298 
1169 
409 


2021 
2225 
2457 
2104 


526 
1793 
3364 
3067 


559 
1102 
1077 
646 
926 
1392 
3105 


75 
5795 
2937 


WEIGHTED N
1408047 (1%)


694799 (2%)
713248 (1%)


1016633 (1%)
196446 (2%)
152335 (5%)
42633 (7%)


316585 (2%)
331763 (6%)
384599 (5%)
375101 (3%)


77745 (5%)
275493 (4%)
549929 (3%)
495435 (2%)


66997 (15%)
162691 (17%)
[illegible] (15%)
103743 (22%)!
153625 (14%)
233230 (8%)
457911 (5%)


9566 (15%)
1026813 (1%)
371667 (2%)


MEAN 
1.58(0.01)


1.50(0.01)
1.66(0.01)


1.63(0.01)
1.38(0.02)
1.46(0.02)
1.60(0.03)


1.61(0.02)
1.54(0.02) 
1.60(0.02) 
1.57(0.01) 


1.43(0.03) 
1.54(0.01) 
1.66(0.01) 
1.53(0.01) 


1.53(0.02) 
1.42(0.02) 
1.70(0.02) 
1.55(0.03) 
1.60(0.02) 
1.59(0.02) 
1.58(0.01) 


1.55(0.07) 
1.60(0.01) 
1.52(0.01) 


ST.  DEV. 
0.41(0.00) 


0.40(0.01) 
0.41(0. .1) 


0.40(0.01) 
0.40(0.01) 
0.40(0.01) 
0.40(0.02) 


0.41(0.01) 
0.41(0.01) 
0.40(0.01) 
0.41(0.01) 


0.39(0.02) 
0.40(0.01) 
0.41(0.01) 
0.40(0.01) 


0.40(0.02) 
0.41(0.01) 
0.40(0.01) 
0.41(0.02) 
0.40(0.01) 
0.40(0.01) 
0.40(0.01) 


0.40(0.05) 
0.41(0.01) 
0.37(0.12) 


-  10  - 
0.96(0.01) 


0.90(0.01) 
1.04(0.02) 


1.03(0.02) 
0.82(0.02) 
0.87(0.02) 
0.99(0.05) 


0.98(0.02) 
0.92(0.01) 
0.93(0.02) 
0.94(0.02) 


0.67(0.03) 
0.93(0.02) 
1.05(0.02) 
0.92(0.01) 


0.92(0.04) 
0.84(0.02) 
1.13(0.06) 
0.93(0.03) 
0.98(0.03i 
0.96(0.02) 
0.97(0.01) 


0.96(0.11) 
0.98(0.01) 
0.90(0.01) 


-  25  - 
1.29(0.01) 


1.22(0.02) 
1.35(0.01) 


1.3^(0.01) 
1.06(0.02) 
1.14(0.03) 
1.32(0.04) 


1.31(0.02) 
1.26(0.01) 
1.31(0.02) 
1.28(0.02) 


1.14(0.05) 
1.26(0.02) 
1.36(0.01) 
1.26(0.01) 


1.24(0.05) 
1.10(0.03) 
1.39(0.02) 
1.20(0.02) 
1.31(0.02) 
1.29(0.02) 
1.30(0.01) 


1.29(0.14) 
1.31(0.01) 
1.23(0.02) 


-  50  - 
1.57(0.01) 


1.50(0.01) 
1.64(0.01) 


1.62(0.01) 
1.39(0.01' 
1.45(0.02) 
1.59(0.03) 


1.60(0.02) 
1.54(0.01) 
1.59(0.02) 
1.56(0.01) 


1.44(0.03) 
1.54(0.01) 
1.65(0.01) 
1.52(0.01) 


1.52(0.03) 
1.43(0.02) 
1.69(0.03) 
1.55(0.03) 
1.59(0.03) 
1.58(0.02} 
1.37(0.01) 


1.55(0.07) 
1.59(0.01) 
1.52(0.01) 


-  75  - 
1.90(0.01) 


1.79(0.02) 
1.98(0.01) 


1.96(0.01) 
1.67(0.02) 
1.73(0.02) 
1.92(0.04) 


1.94(0.02) 
1.85(0.03) 
1.92(0  02) 
1.89(0.02) 


1.70(0.03) 
1.85(0.02) 
2.00(0.01) 
1.63(0.02) 


1.82(0.05) 
1.71(0.02) 
2.03(0.02) 
1.87(0.04) 
1.92(0.04) 
1.92(0.03) 
1.90(0.02) 


1.84(0.13) 
1.93(0.01) 
1.82(0.03) 


-  90  - 
2.17(0.01) 


2.10(0.01) 
2.21(0.01) 


2.19(0.01) 
1.99(0.03) 
2.07(0.03) 
2.18(0.03) 


2.18(0.01) 
2.14(0.02) 
2.17(0.02) 
2.16(0.02) 


2.03(0.03) 
2.14(0.01) 
2.21(0.01) 
2.13(0.01) 


2.12(0.02) 
2.05(0.02) 
2.24(0.02) 
2.15(0.02) 
2.17(0.02) 
2.17(0.02) 
2.16(0.01) 


2.17(0.10) 
2.18(0.01) 
2.13(0.01) 


! INTERPRET WITH CAUTION. STANDARD ERRORS ARE POORLY ESTIMATED.


Table 15(19)

NAEP 1983-84 READING AND WRITING ASSESSMENT - 8TH GRADERS
WEIGHTED MEANS, STANDARD DEVIATION (N-1), AND PERCENTILES FOR REPORTING GROUPS

A.R.M. WRITING PROFICIENCY (AVERAGE OF 5 PLAUSIBLE VALUES)


- TOTAL -

SEX
MALE
FEMALE

ETHNICITY/RACE
WHITE
BLACK
HISPANIC
OTHER

REGION
NORTHEAST
SOUTHEAST
CENTRAL
WEST

PARENTAL EDUCATION
NOT GRADUATED H.S.
GRADUATED H.S.
POST H.S.
UNKNOWN

SIZE/TYPE OF COMMUNITY
RURAL
DISADVANTAGED URBAN
ADVANTAGED URBAN
BIG CITIES
FRINGE OF BIG CITIES
MEDIUM CITIES
SMALL PLACES

AGE
12 OR YOUNGER
13 YEARS OLD
14 OR OLDER


N 

11092 


5466 
5606 


7916 
1500 
1271 
405 


2516 
2759 
3069 
2726 


1072 
3667 
5036 
1000 


645 
1073 
1229 
1130 
1536 
1551 
3926 


69 

7420 
3563 


WEIGHTED N
1682192 (1%)


6397761  1X1 
6424161  1X1 


12496371  IX) 

2329311  2X1 

1502121  4X) 

492121  4X1 


36200Ci  2X) 

3666951  r/l 

443166 C  5X) 

4663251  2X) 


152 336 C  SX) 

564033C  3X) 

7650 37C  3X) 

1446651  5X) 


661121 21X)! 
1429651 16X) 
1613171 22X)! 
177473C34X>! 
2626271 16X) 
257076C16X) 
5544021  r/) 


127631  ir/) 
11556031  IX) 
5136261  2X) 


MEAN
2.05(0.01) 


1.96(0.01) 
2.14(0.01) 


2.11(0.01) 
1.66(0.01) 
1.67(0.02) 
2.09(0.03) 


2.09(0.01) 
2.03(0.02) 
2.06(0.01) 
2.03(0.02) 


1.69(0.02) 
2.02(0.01) 
2.13(0.01) 
1.90(0.02) 


2.03(0.03) 
1.66(0.02) 
2.21(0.02) 
2.01(0.02) 
2.07(0.02) 
2.04;0.03) 
2.05(0.01) 


2.07(0.06) 
2.06(0.01) 
1.96(0.01) 


ST.  DEV. 
0.40(0.00) 


0.39(0.00) 
0.39(0.00) 


0.39(0.00) 
0.36(0.01) 
0.39(0.01) 
0.39(0.02) 


0.40(0.01) 
0.41(0.01) 
0.39(0.01) 
0.40(0.01) 


0.36(0.01) 
0.36(0.01) 
0.39(0.00) 
0.39(0.02) 


0.36(0.02) 
0.36(0.01) 
0.39(0.01) 
0.39(0.01) 
0.39(0.00) 
0.40(0.01) 
0.39(0.01) 


0.37(0.03) 
0.39(0.00) 
0.33(0.17) 


-  10  - 
1.45(0.01) 


1.36(0.01) 
1.56(0.01) 


1.52(0.01) 
1.32(0.01) 
1.32(0.02) 
1.49(0.04) 


1.49(0.01) 
1.42(0.02) 
1.46(0.01) 
1.43(0.01) 


1.34(0.02) 
1.43(0.01) 
1.54(0.01) 
1.34(0.01) 


1.44(0.04) 
1.33(0.02) 
1.66(0.05) 
1.42(0.02) 
1.47(0.02) 
1.43(0.02) 
1.45(0.01) 


1.54(0.13) 
1.48(0.01) 
1.36(0.01) 


-  25  - 
1.76(0.01) 


1.66(0.01) 
1.65(0.01) 


1.63(0.01) 
1.56(0.02) 
1.55(0.02) 
1.62(0.03) 


1.61(0.01) 
1.76(0.02) 
1.79(0.01) 
1.76(0.01) 


1.59(0.03) 
1.77(0.01) 
1.64(0.01) 
1.59(0.02) 


1.77(0.03) 
1.57(0.03) 
1.90(0.02) 
1.75(0.04) 
1.60(0.02) 
1.77(0.02) 
1.79(0.01) 


1.64(0.04) 
1.61(0.01) 
1.66(0.02) 


-  50  - 
2.04(0..  ' ) 


1.97(0.01) 
2.12(0.01) 


2.09(0.01) 
1.66(0.02) 
1.69(0.02) 
2.09(0.04) 


2.06(0.01) 
2.02(0.01) 
2.05(0.01) 
2.03(0.01) 


1.91(0.02) 
2.02(0.01) 
2.12(0.01) 
1.91(0.02) 


2.02;0.02) 
1.90(0.02) 
2.19(0.02) 
2.02(0.02) 
2.06(0.02) 
2.03(0.02) 
2.05(0.01) 


2.07(0.06) 
2.07(0.01) 
1.98(0.01) 


-  75  - 
2.35(0.01: 


2.23(0.01) 
2.45(0.011 


2.41(0.01) 
2.14(0.02) 
2.16(0.02) 
2.41(0.05) 


2.40C0.C2) 
2.33(0.03) 
2.35(0.02) 
2.33(0.03) 


2.17(0.02) 
2.29(0.01) 
2.45(0.01) 
2.16(0.03) 


2.30(0.0.) 
2.16(0.02) 
2.52(0.02) 
2.30(0.05) 
2.37(0.03) 
2.34(0.04) 
2.35(0.32) 


2.35(0.11) 
2.39(0.01) 
2.25(0.01) 


-  90  - 
2.64(0.0X1 


2.56(0.01) 
2.69(0.01) 


2.67(0.01) 
2.42(0.04) 
2.47(0.03) 
2.67(0.03) 


2.66(0.01) 
2.64(0.02) 
2  63(0.011 
2.6213. 02i 


2.46(0.03) 
2.60(0.01) 
2.69(0.01) 
2.49(0.04) 


2.61(0.021 
2.45(0.041 
2.73(0.02) 
2.60(0.02) 
2.64(0.01) 
2.64(0.02) 
2.64(0.011 


2.63(0.051 
2.66(0.01) 
2.56(0.01) 


! INTERPRET WITH CAUTION. STANDARD ERRORS ARE POORLY ESTIMATED.


Table 15(20)

NAEP 1983-84 READING AND WRITING ASSESSMENT - 11TH GRADERS
WEIGHTED MEANS, STANDARD DEVIATION (N-1), AND PERCENTILES FOR REPORTING GROUPS

A.R.M. WRITING PROFICIENCY (AVERAGE OF 5 PLAUSIBLE VALUES)

                                  N   WEIGHTED N         MEAN

- TOTAL -                     10657  [illegible] ( 1%)   2.19 (0.01)

SEX
  MALE                         5215       714418 ( ?%)   2.09 (0.01)
  FEMALE                       5442       715823 ( ?%)   2.29 (0.01)

ETHNICITY/RACE
  WHITE                        7092      1077899 ( ?%)   2.24 (0.01)
  BLACK                        1476       205670 ( ?%)   2.00 (0.02)
  HISPANIC                      902       107250 ( ?%)   2.00 (0.02)
  OTHER                         385        39422 ( ?%)   2.16 (0.03)

REGION
  NORTHEAST                    2459       354526 ( ?%)   2.22 (0.03)
  SOUTHEAST                    2705       313707 ( 9%)   2.16 (0.02)
  CENTRAL                      2959       390762 ( ?%)   2.20 (0.02)
  WEST                         2534       371246 ( ?%)   2.17 (0.01)

PARENTAL EDUCATION
  NOT GRADUATED H.S.           1267       159736 ( ?%)   1.99 (0.02)
  GRADUATED H.S.               3675       479173 ( ?%)   2.15 (0.01)
  POST H.S.             [illegible]       740485 ( ?%)   2.27 (0.01)
  UNKNOWN                       290        37087 ( 7%)   1.99 (0.03)

SIZE/TYPE OF COMMUNITY
  RURAL                         699        79285 (22%)!  2.13 (0.03)
  DISADVANTAGED URBAN          1029       142348 (22%)!  2.01 (0.02)
  ADVANTAGED URBAN             1458       240121 (16%)   2.26 (0.02)
  BIG CITIES                    996       126101 (24%)!  2.18 (0.02)
  FRINGE OF BIG CITIES         1011       142241 (27%)!  2.19 (0.02)
  MEDIUM CITIES                1607       241537 ( ?%)   2.21 (0.02)
  SMALL PLACES                 3657       458608 ( ?%)   2.19 (0.01)

AGE
  16 OR YOUNGER                1102       184232 ( 6%)   2.23 (0.02)
  17 YEARS OLD                 7919       963071 ( ?%)   2.21 (0.01)
  18 OR OLDER                  1636       262938 ( 3%)   2.06 (0.02)

! INTERPRET WITH CAUTION. STANDARD ERRORS ARE POORLY ESTIMATED.


ST.  DEV. 
0.44(0.01) 


0.43(0.01) 
0.43(0.01) 


0.43(0.01) 
0.41(0.01) 
0.42(0.01) 
0.44(0.03) 


0.44(0.01) 
0.44(0.01) 
0.43(0.01) 
0.44(0.01) 


0.43(0.01) 
0.42(0.01) 
0.43(0.01) 
0.41(0.02) 


0.44(0.01) 
0.41(0.02) 
0.44(0.01) 
0.43(0.01) 
0.43(0.01) 
0.43(0.02) 
0.43(0.01) 


0.44(0.01) 
0.43(0.01) 
0.36(0.18) 


-  10  - 
1.55(0.02) 


1.45(0.01) 
1.73(0.04) 


1.65(0.03) 
1.39(0.02) 
1.37(0.02) 
1.52(0.05) 


1.58(0.05) 
1.50(0.03) 
1.53(0.04) 
1.53(0.02) 


1.36(0.02) 
1.52(0.02) 
1.69(0.04) 
1.38(0.04) 


1.48(0.04) 
1.40(0.02) 
1.70(0.05) 
1.55(0.04) 
1.57(0.03) 
1.56(0.04) 
1.56(0.02) 


1.6r:(0.05) 
1.56(0.02) 
1.43(0.02) 


-  25  - 
1.87(0.01) 


1.79(0.01) 
1.95(0.01) 


1.92(0.01) 
1.70(0.04) 
1.68(0.04) 
1.64(0.03) 


1.89(0.02) 
1.84(0.02) 
1.88(0.02) 
1.85(0.01) 


1.67(0.04) 
1.64(0.01) 
.93(0.01) 
; .70(0.07) 


1.82(0.04) 
1.72(0.05) 
1.95(0.02) 
1.86(0.02) 
1.88(0.02) 
1.66(0.02) 
3.87(0.01) 


1.90(0.02) 
1.89(0.01) 
1.78(0.02) 


-  50  - 
2.18(0.01) 


2.0^(0.01) 
2.30(0.02) 


2.24(0.01) 
2.00(0.02) 
2.00(0.02) 
2.14(0.04) 


2.22(0.03) 
2.15(0.02) 
2.19(0.02) 
2.16(0.01) 


1.99(0.02) 
2.14(0.01) 
2.27(0.01) 
2.00(0.04) 


2.12(0.05; 
2.01(0.02) 
2.30(0.02) 
2.16(0.03) 
2.19(0.03) 
2.20(0.03) 
2.19(0.02) 


2.23(0.03) 
2.20(0.01) 
2.08(0.02) 


-  75  - 
2.53(0.01) 


2.42(0.01) 
2.61(0.01) 


2.57(0.01) 
2.30(0.04) 
2.31(0.04) 
2.49(0.03) 


2.56(0.02) 
2.51(0.02) 
2.54(0.02) 
2.51(0.01) 


2.26(0.04) 
2.49(0.01) 
2.60(0.01) 
2.28(0.07) 


2.46(0.04) 
2.30(0.05) 
2.62(0.02) 
2.51(0.03) 
2.53(0.02) 
2.55(0.02) 
2.53(0.02) 


2.57(0.02) 
2.55  0.01) 
2.43(0.02) 


-  90  - 
2.75(0.02) 


2.68(0.01) 
2.91(0.02) 


2.83(0.02) 
2.62(0.02) 
2.62(0.03) 
2.73(0.04) 


2.81(0.05) 
2.73(0.01) 
2.78(0.05) 
2.74(0.01) 


2.61(0.02) 
2.72(0.01) 
2.88(0.02) 
2.60(0.03) 


2.72(0.02) 
2.62(0.02) 
2.92(0.04) 
2.75(0.03) 
2.73(0.01) 
2.78(0.05) 
2.76(0.04) 


2.82(0.06) 
2.79(0.03) 
2.68(0.02) 




Table  15(21) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

mWTED  STUDENT  GRADE 


                                                                 GRADE 4
                                  N   WEIGHTED N         PERCENT       MEAN          %-OMIT

- TOTAL -                     17840      2837712 ( 0%)   100.0 (0.0)   217.5 (0.7)      0.0

SEX
  MALE                         9063      1412664 ( 1%)   100.0 (0.0)   215.1 (0.9)      0.0
  FEMALE                       8777      1425047 ( 1%)   100.0 (0.0)   220.0 (0.7)      0.0

ETHNICITY/RACE
  WHITE                       11782      2002444 ( 1%)   100.0 (0.0)   224.9 (0.9)      0.0
  BLACK                        2796       431187 ( 1%)   100.0 (0.0)   194.9 (1.3)      0.0
  HISPANIC                     2469       322624 ( 2%)   100.0 (0.0)   201.2 (1.0)      0.0
  OTHER                         793        81456 ( 4%)   100.0 (0.0)   221.6 (1.8)      0.0

PARENTAL EDUCATION
  NOT GRADUATED H.S.           1126       166134 ( 5%)   100.0 (0.0)   200.2 (1.2)      0.0
  GRADUATED H.S.               3650       570097 ( 3%)   100.0 (0.0)   215.5 (0.8)      0.0
  POST H.S.                    6634      1075734 ( 3%)   100.0 (0.0)   227.4 (1.1)      0.0
  UNKNOWN                      6272      1001015 ( 2%)   100.0 (0.0)   211.6 (0.8)      0.0

AGE
  9 YEARS OLD                 11507      2031707 ( 0%)   100.0 (0.0)   221.7 (0.8)      0.0
  10 OR OLDER                  6203       788337 ( 1%)   100.0 (0.0)   206.5 (0.9)      0.0




Table  15(22) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)
STUDENT  SEX 


                                                                   MALE                       FEMALE
                                  N   WEIGHTED N         PERCENT      MEAN           PERCENT      MEAN          %-OMIT

- TOTAL -                     17840      2837712 ( 0%)    49.8 (0.5)  215.1 (0.9)    50.2 (0.5)  220.0 (0.7)    0.0

SEX
  MALE                         9063      1412664 ( 1%)   100.0 (0.0)  215.1 (0.9)     0.0 (0.0)  [illegible] (0.0)    0.0
  FEMALE                       8777      1425047 ( 1%)     0.0 (0.0)  [illegible]   100.0 (0.0)  220.0 (0.7)    0.0

ETHNICITY/RACE
  WHITE                       11782      2002444 ( 1%)    49.4 (0.6)  222.8 (1.1)    50.6 (0.6)  226.9 (0.9)    0.0
  BLACK                        2796       431187 ( 1%)    47.6 (1.2)  191.3 (1.7)    52.4 (1.2)  198.2 (1.5)    0.0
  HISPANIC                     2469       322624 ( 2%)    53.8 (1.3)  198.9 (1.3)    46.2 (1.3)  204.0 (1.4)    0.0
  OTHER                         793        81456 ( 4%)    54.8 (1.8)  215.7 (2.1)    45.2 (1.8)  228.7 (?.5)    0.0

PARENTAL EDUCATION
  NOT GRADUATED H.S.           1126       166134 ( 5%)    45.5 (1.9)  195.1 (1.9)    54.5 (1.9)  204.5 (1.7)    0.0
  GRADUATED H.S.               3650       570097 ( 3%)    50.0 (0.9)  211.5 (1.1)    50.0 (0.9)  219.6 (1.2)    0.0
  POST H.S.                    6634      1075734 ( 3%)    52.1 (0.7)  224.8 (1.4)    47.9 (0.7)  230.2 (1.0)    0.0
  UNKNOWN                      6272      1001015 ( 2%)    47.7 (0.9)  209.7 (0.9)    52.3 (0.9)  213.3 (1.1)    0.0

AGE
  9 YEARS OLD                 11507      2031707 ( 0%)    46.5 (0.6)  220.2 (1.1)    53.5 (0.6)  223.1 (0.?)    0.0
  10 OR OLDER                  6203       788337 ( 1%)    58.3 (0.6)  204.3 (1.1)    41.7 (0.8)  209.5 (1.3)    0.0




Table 15(23)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

ETHNICITY/RACE

                            N     WEIGHTED N         WHITE        BLACK     HISPANIC     AMER IND        ASIAN      UNCLASS   %-OMIT

TOTAL                   17840   2837712( 0%)    70.6( 0.2)   15.2( 0.1)   11.4( 0.2)    1.3( 0.1)    1.5( 0.1)    0.0( 0.0)      0.0
                                               224.9( 0.9)  194.9( 1.3)  201.2( 1.0)  216.4( 2.5)  226.0( 2.9)  219.7(12.7)

SEX
  MALE                   9063   1412664( 1%)    70.0( 0.5)   14.5( 0.4)   12.3( 0.4)    1.6( 0.1)    1.6( 0.2)    0.0( 0.0)      0.0
                                               222.8( 1.1)  191.3( 1.7)  198.9( 1.3)  211.2( 3.0)  219.9( 3.3)  228.9(28.4)
  FEMALE                 8777   1425047( 1%)    71.1( 0.5)   15.9( 0.4)   10.5( 0.3)    1.0( 0.1)    1.5( 0.1)    0.0( 0.0)      0.0
                                               226.9( 0.9)  198.2( 1.5)  204.0( 1.4)  224.2( 3.3)  232.2( 3.6)  211.3(11.7)

ETHNICITY/RACE
  WHITE                 11782   2002444( 1%)   100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                               224.9( 0.9)  *****( 0.0)  *****( 0.0)  *****( 0.0)  *****( 0.0)  *****( 0.0)
  BLACK                  2796    431187( 1%)     0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                               *****( 0.0)  194.9( 1.3)  *****( 0.0)  *****( 0.0)  *****( 0.0)  *****( 0.0)
  HISPANIC               2469    322624( 2%)     0.0( 0.0)    0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                               *****( 0.0)  *****( 0.0)  201.2( 1.0)  *****( 0.0)  *****( 0.0)  *****( 0.0)
  OTHER                   793     81456( 4%)     0.0( 0.0)    0.0( 0.0)    0.0( 0.0)   45.3( 3.4)   53.5( 3.3)    1.1( 0.5)      0.0
                                               *****( 0.0)  *****( 0.0)  *****( 0.0)  216.4( 2.5)  226.0( 2.9)  219.7(12.7)

PARENTAL EDUCATION
  NOT GRADUATED H.S.     1126    166134( 5%)    61.2( 2.2)   17.4( 1.4)   18.8( 1.7)    1.9( 0.4)    0.8( 0.2)    0.0( 0.0)      0.0
                                               207.9( 1.4)  184.2( 4.2)  189.2( 3.5)  205.1( 9.0)  207.1(18.3)  *****( 0.0)
  GRADUATED H.S.         3650    570097( 3%)    71.6( 1.0)   16.0( 0.8)   10.5( 0.7)    1.3( 0.2)    0.5( 0.1)    0.0( 0.0)      0.0
                                               222.3( 0.9)  195.2( 2.0)  200.3( 1.8)  217.9( 4.6)  220.4( 6.3)  *****( 0.0)
  POST H.S.              6634   1075734( 3%)    72.7( 0.6)   14.4( 0.5)    9.8( 0.4)    1.3( 0.1)    1.8( 0.2)    0.0( 0.0)      0.0
                                               235.1( 1.2)  199.8( 1.8)  209.7( 1.7)  221.3( 3.8)  236.4( 3.1)  219.4(30.1)
  UNKNOWN                6272   1001015( 2%)    69.9( 0.7)   15.0( 0.6)   11.9( 0.6)    1.2( 0.1)    1.9( 0.3)    0.1( 0.0)      0.0
                                               217.8( 1.0)  192.3( 2.1)  19?.4( 1.4)  213.3( 4.7)  218.1( 3.5)  219.9(18.7)

AGE
  9 YEARS OLD           11507   2031707( 0%)    72.4( 0.3)   13.9( 0.1)   10.8( 0.3)    1.2( 0.1)    1.6( 0.1)    0.0( 0.0)      0.0
                                               228.1( 1.0)  199.1( 1.5)  207.0( 1.3)  222.8( 2.8)  229.0( 3.2)  222.0(13.5)
  10 OR OLDER            6203    788337( 1%)    66.0( 0.3)   18.3( 0.3)   12.9( 0.4)    1.6( 0.1)    1.2( 0.2)    0.0( 0.0)      0.0
                                               215.4( 1.0)  186.6( 1.8)  188.7( 1.4)  203.5( 5.5)  213.7( 3.9)  199.3(****)




Table  15(24) 


NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

REGION 


N 

NEI6HTED  N 

NE 

SE 

CENTRAL 

WEST

%-OMIT

—  TOTAL 

17840 

2837712C 

07.) 

22. 3C 
221. 6C 

0.4) 
1.6) 

23. 7C 
212. 2C 

1.3) 
1.6) 

26. 7C 
221. 5C 

1.3) 
1.8) 

27.  3C 
215. OC 

0.5) 
1.3) 

0.0 

SEX 
MALE 

9063 

1412664C 

IX) 

99  Tf 

220. 5C 

v.O  9 
2.0) 

9T  At 

206. 3C 

1.41 
2.0) 

26. 4C 
218. 6C 

1.5) 
2.3 ) 

27. 9C 
213. IC 

0.7) 
1.4) 

0.0 

FEMALE 

8777

1425047C 

IX) 

99  At 

222. 7C 

0.7 } 
1.9) 

24 .  OC 
216. OC 

1.3 ) 
1.4) 

26. 9C 
224. 2C 

1.3) 
1.3 ) 

26. 7C 
217. OC 

0.6) 
1.7) 

0.0 

ETHNICITY/RACE 
WHITE

11782 

2002444C 

IX) 

23. 5C 
227. 5C 

0.2) 
1.7) 

21.  IC 
222. OC 

1.5) 
1.7) 

30. 7C 
225. 7C 

1.5) 
1.6) 

24. 7C 
223. 8C 

0.2) 
2.3) 

0.0 

BLACK

2796 

431187t 

IX) 

21. 2C 
199. 6C 

0.4) 
3.6) 

44. 6C 
192. 2C 

0.5) 
1.7) 

19. 4C 

1  OA  Cf 
X  tD  .D\ 

3.4) 
T  n  1 

^.  U  1 

14. 8C 
194. 2C 

3.5) 
3.3i 

0.0 

HISPANIC

322624^ 

2X) 

17. 'A 
200. 7C 

3.3) 
3.1) 

14. 7C 
204. IC 

4.4) 
2.6) 

12. 9C 

cu / . s\ 

2.6) 

9   0  1 

54. 8C 
196. 6C 

1.0) 
1.1) 

0.0 

OTHER 

81456 ( 

4X) 

17. 1( 
221. 8( 

2.7) 
2.5) 

14.  7C 
219. 4C 

3.0) 
6.3) 

22. OC 
222.9C 

3.0) 
3.8 ) 

46. 2C 

221. 6( 

5.5) 
3.1) 

0.0 

PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 

1126 

166134( 

5X) 

17.lt 
204.  9( 

1.5) 
4.5) 

32. 9C 
197. 3C 

2.6) 
2.4) 

22. 3C 
203. 5C 

2.4) 
1.6) 

27. 6C 
198. 2C 

3.0) 
2.6 ) 

0.0 

GRADUATED  H.S. 

3650 

570097( 

3X) 

22. 2( 
221. 2( 

1.6) 
1.8) 

25. 9C 
209. OC 

1.9) 
1.9) 

29. 6C 
219. 6C 

2.1) 

X  .  •»  J 

22. 4C 
212. OC 

1.1) 
2.1) 

0.0 

POST H.S.

1075734( 

3X) 

22 .  IC 
231. IC 

1.1 ) 
2.0) 

22. 6C 
222. 9C 

1.5) 
2.6) 

26. 6C 
230. OC 

1.9) 
2.2) 

28. 7C 
225. 7C 

1.3) 
1.7) 

0.0 

UNKNOWN

6272 

1001015( 

2X) 

23. 6C 
214. 9C 

1.0) 
2.1) 

22. 3C 
206. 7C 

1.7) 
1.2) 

26.  IC 
216. 2C 

1.6) 
1.6) 

28. OC 
208. 5C 

1.1) 
1.4) 

0.0 

AGE 

9  YEARS  OLD 

11507 

2031707( 

OX) 

22. 5C 
225. 4C 

0.4) 
1.5) 

23. 4C 
217. 8C 

1.41 
1.9) 

27. OC 
225. OC 

1.4) 
1.9) 

27. OC 
218. 9C 

0.5) 
1.6) 

0.0 

10  OR  OLDER 

6203

7883371 

IX) 

21.  3C 
211. 5C 

0.5) 
3.0) 

24. 7C 
198. 2C 

1.4) 
1.3) 

25. 9C 
211. 5C 

1.5) 
2.0) 

28. IC 
205. 2C 

0.6) 
1.1) 

0.0 



Table  15(25) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

IMPUTED STUDENT AGE


—  TOTAL 


17840 


WEIGHTED N


2837712( 0%)


7-LESS 

0.0( 0.0)
182.7(27.0)


8


0.6( 0.1)
229.2( 3.9)


9


71. 6C  0.1) 
221. 7(  0.6) 


10 

24. 5C  0.2) 
209. OC  0.9) 


11 

2.9C  0.2) 

iee.3C  1.6) 


12-MORE

0.4C  0.1) 
161. 5C  6.5) 


%-OMIT
0.0 


SEX 
MALE


9063         1412664( 1%)


O.OC  0.0) 
165.5C16.2) 


0.5C 
234. 7C 


0.1) 
6.1) 


66. 9C  0.4) 
220. 2C  1.1) 


26. 3C  0.4) 
207. OC  1.2) 


3.7C  0.3) 
167. 2C  2.4) 


0.5C  0.1) 
179. 1(  5.7) 


0.0 


FEMALE 


8777         1425047( 1%)


O.OC  0.0) 
223.7C»»»») 


0.7C 
225. OC 


0.1) 
4.7) 


76. 2C  0.4) 
223. IC  0.6) 


20. 6C  0.3) 
211. 7(  1.41 


2.1C  0.2) 
190. 3C  2.4) 


0.2C  0.1) 
187.9C10.2) 


0.0 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


11782         2002444( 1%)


2796  431187( 1%)


2469  322624( 2%)


793


81456( 4%)


O.OC  0.0) 
miiDDiC  0.0) 

O.OC  0.0) 
156.4C14.6) 

O.OC  0.0) 
174.5C»»»») 

O.IC  0.1) 
223.7C»»»») 


0.5C  0.1) 
241. OC  4.7) 


0.6C 
200. 6C 

0.6C 
206. IC 


0.2) 
4.6) 

0.1) 
9.7) 


1.5C  0.5) 
236.4C13.7) 


73. 5C  0.1) 

226. IC  1.0) 

65. 6C  0.4) 

199. IC  1.5) 

66. OC  0.8) 

207. OC  1.3) 

71. 6C  1.3) 

226. 3C  2.2) 


23. 9C  0.2) 

217.1C  1.1) 

26. 9C  "  7) 

166. IC  1.6) 

25. 6C  1.0) 

191. OC  1.2) 

22. 3C  1.1) 

211. OC  2.9) 


1.9C  0.2) 

196. 5C  1.9) 

6.1C  0.6) 

161. IC  4.1) 

4.6C  0.5) 

179. SC  4.4) 

3.6C  0.6) 

193. 2C  6.5) 


0.2C 
169. 7C 


0.0) 
7.2) 


0.5C  0.2) 
177. 5C  7.6) 

0.9C  0.3) 
172.2C22.1) 

0.5C  0.2) 
172.5C10.6) 


0.0 


0.0 


0.0 


0.0 


PARENTAL EDUCATION
NOT GRADUATED H.S.


GRADUATED H.S.
POST H.S.
UNKNOWN


1126  166134( 5%)


3650  570097( 3%)


6634         1075734( 3%)


6272         1001015( 2%)


O.OC  0.0) 
ffffffffffC  0.0) 

O.OC  0.0) 
ii)i)()()(C  0.0) 

O.OC  0.0' 
165.5C16.2) 

O.OC  0.0) 
223. 7C  »>Hi») 


O.IC  0.1) 
203.4(21.1) 

0.4C  0.1) 
209.5C13.3) 


0.6C 
239. OC 


0.1) 

5,4) 


0.6C  0.1) 
222. OC  5.9) 


60. 9C  1.9) 

205. IC  1.1) 

69. 3C  9.9) 

219. 4C  0.9) 

74. 9(  0.5) 

231. 2C  1.2) 

71. 4C  0.6) 

215. 3C  0.9) 


31. 5C  1.9) 

195. 6C  2.3) 

27. OC  0.6) 

209. 2C  1.5) 


22. IC 
217.6C 


0.4) 
1.5) 


24. 4C  0.5) 
204. IC  1.2) 


7.0C  0.7) 

161. 2C  4.0) 

3.0C  0.3) 

767. 7C  3.3; 

1.9C  0.2) 

193. 3C  3.4) 

3.2(  0.3) 

166. 2C  3.0) 


0.4C  0.2) 
161.1C44.6) 


0.3C 
169. 3C 


0.1) 
6.0) 


0.2c  0.1) 
179.1C10.9) 


0.4C 
162. OC 


0.1) 
5.9) 


0.0 
0.0 
0.0 
0.0 


AGE 


9  YEARS  OLD 
10  OR  OLDER 


11507         2031707( 0%)


6203  788337( 1%)


O.OC  0.0) 

mmiiiiC  0.0) 

O.OC  0.0) 

Kiiio^iiC  0.0) 


O.OC  0.0) 

K^KIHIC  0.0) 

O.OC  0.0) 

mmiiiiC  0.0) 


100. OC  0.0) 

22i.7C  0.6) 

O.OC  0.0) 

<HHHf»(  0.0) 


O.OC  0.0) 
0.0) 

86.2c  0.7) 
209. OC  0.9) 


O.OC  0.0) 

ff««««C  0.0) 

10. 5C  0.6) 

188. 3C  1.6) 


r.OC  0.0) 

HKKKKC  0.0) 

1.3C  0.2) 

161.5C  6.5) 


0.0 
0.0 


Table  15(26) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

SIZE/TYPE OF COMMUNITY


TOTAL 


SEX 
MALE


FEMALE


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


PARENTAL EDUCATION
NOT GRADUATED H.S.


GRADUATED H.S.
POST H.S.
UNKNOWN

AGE 

9  YEARS  OLD 
10  OR  OLDER 


N  WEIGHTED N

17840  2837712( 0%)

9063  1412664( 1%)

8777  1425047( 1%)

11782  2002444( 1%)

2796  431187( 1%)

2469  322624( 2%)


RURAL 


DIS  URB 


ADV URB


BIG CITY


FRINGE


MEDIUM


SMALL


%-OMIT


793  81456( 4%)


1126  166134( 5%)


3650  570097( 3%)


6634         1075734( 3%)


6272         1001015( 2%)


11507         2031707( 0%)


6203  788337( 1%)


6.5C  1.1)  l.?.7C  2.0)  14. IC  2.1 1  7.6C  1.5)  11. IC  1.6)  16. IC  1.4)  32. OC  i  7)  0  0 
207. 7C  2.5)  198. 5{  1.5)  234. 5C  2.3)  214. 2C  2.6)  219. 2C  1.6)  218. 9C  2.3)  219. 2C  0.9) 


6.6C  1.1)  12. 2C  2.0)  14. 3C  2.1)      7.8C  1.5)    11, 4C  1.8)    16. 3C  1.5)    31. 5C  1.8) 

204. 6C  2.5)  195. IC  1.9)  233. OC  3.5)  211. OC  2.9)  216. 6C  1.8)  217. 2C  3.1)  216. 3C  1.2) 

6.3C  1.2)  13. 2C  2.2)  13. 9C  2.1)      7.3C  1.6)    10. 7C  1.4)    16. OC  1.3)    32. 6C  1  7) 

;»10.9C  3.4)  201. 6C  1.6)  236. OC  2.2)  217. 6C  2.8)  221. 9C  1.8)  220. 6C  1.7)  221.9C  0.9) 


6.3C  1.1)      6.5C  1.7)    16. IC  2.5)      6.2C  1.6)    12. 2C  1.6)    16. 4C  1.3)  36. 3C  1  9) 

217.1C  ?.2)  212. 8C  2.7)  237. 2C  2.1)  222. 5C  2.8)  222. 6C  1.4)  225. 7C  2.2)  223. 7C  0.9) 

5.?C  1.6)    35.9C  6.0)      7.1C  2.0)    10. 2C  3.5)      4.6C  1.3)    13.6C  2.6)  23. 3C  3.6) 

181. .C  3.3)  190. 6C  1.7)  217. 8C  6.3)  197.3C  3.3)  198. 4C  5.9)  194.6(  2.7)  196. IC  1.9) 

10. OC  5.1)    20. 5(  5.^)      9.5C  2.4)    11,4C  3.1)    11. OC  3.6)    17. 8C  4.6)  19. 8C  2  9) 

190. IC  3.4)  187. 5C  2.5)  221. 5C  4.1)  203. 8C  2.3)  205.4C  3.4)  204.4C  1.5)  204. 7C  2.4) 

^'^^    "        ^'^^              2         ^2.0C  2.5)    17.2C  4.3)    14.6C  2.2)  21. 3C  2.9) 

197.4C  5.6)  207. OC  4.5)  236. IC  4.7)  224. 3C  4.7)  224. 7C  3.2)  220. 2C  5.2)  215.4C  3.7) 


10. 3C  2.4)    14. 4C  2.6)      4.6C  1.1)      6.8C  1,7)      7,3C  1.1)    21. OC  2.9)  35  7C  3  0) 

189.4C  3.1)  1C8.4C  3.8)  214. 6C  6.1)  198. 4C  4.0)  214. IC  4.8)  204. 8C  3.2)  201. IC  I'.B) 

9.3C  1.6)    11. IC  2.1)      6.4C  1.0)      6.5C  1.6)      9.5C  1.3)    15, 2C  1.5)  42  OC  2  7) 

212. OC  3,1)  199. OC  3.2)  225. OC  3.9)  213. 7C  2.9)  219. IC  2.1)  216. 4C  1,9)  218. 5C  1.3) 

4.8C  0.7)    11. 2C  1.8)    21. 3C  3.2)      7.5C  1.5)    11. 4C  2.0)    15. 3C  1.3)  28. 5C  1.6) 

214. 9C  3.9)  204. 6C  2.4)  240. 5C  2.5)  223. IC  3.4)  226. 5C  2.0)  229. 3C  3.7)  229.21  i.O) 

o.f'?!  i'!!    ^"^'^^  ^-^^    ^^-^^  ^'^^  1-®^         2C  1,7)    16, 9C  1.7)  29.4C  1.8) 

204.31  3.6)  195,5C  1.8)  227.7C  2.7)  208. 3C  2.9)  213. OC  2.4)  213. IC  1.9)  213. 8C  1.3) 


6.2C  J.O)  12. 5C  2.0)    15. IC  2.4)  7.9C  1.8)    11, IC  1.8)    15. 9C  1.2)    31. 4C  16) 

212. 2C  2.3)  203. 6C  1.8;  237. 6C  2.2)  218. 4C  2.8)  221. 7C  1,9)  223. OC  3.0)  223. 4C  0.9) 

7.4C  1.6)  13. OC  2.3)    11. 4C  1.5)  6.4C  1,1)    10. 8C  1,2)    16. 7C  2.3)    34  3C  2  4) 

197.9C  3.7)  185. 9C  2.7?  ZZZ.Zi  3.3)  200. 7(  3.1)  212. IC  2.2)  208. 2C  1.5)  209. IC  1.4) 


0.0 
0.0 

0.0 
0.0 
0.0 
0.0 

0.0 
0,0 
0.0 
0.0 

0.0 
0.0 




Table  15(27) 


NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

PARENTAL EDUCATION


—  TOTAL  — 


17682


WEIGHTED N
2812980( 1%)


NOT  HS 

5,9C  0,3) 
200. E(  1.2) 


GRAD  HS 

20.31  0.6) 
215. 5(  0.6) 


POST  HS 

36.21  0.9) 
227. 4(  I.l) 


UNKNOWN

35.61  0.7) 
211. 6(  0.6) 


%-OMIT
0.9 


SEX 
MALE


FEMALE


8965


8717


1398027( 1%)


1414953( 1%)


5.4f  0.3) 

195.11  1.9) 

6.4(  0.4) 

204. 5(  1.7) 


20.41  0.7) 

211.51  1.1) 

20. 1(  0.7) 

219. 6(  1.2) 


40. 1(  1.1) 
224.61  1.4) 


36. 4( 
230. 2( 


1.0) 
1.0) 


34. 1(  0.9) 

209.71  0.9) 

37. OC  0.6) 

213.31  1.1) 


1.0 
0.7 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


11717         1991399( 1%)


2757  425333( 1%)


2426  315870( 3%)


782


80377( 4%)


5.K  0.3) 
207. 9f  1.4) 

6.6(  0.5) 
164.21  4.2) 

9.9(  1.1) 
169.21  3.5) 

5.5(  1.0) 
205. 6(  6.2) 


20.51 
222.31 

21. 5( 
195.21 

19. 0( 
200. 3C 


0.7) 
0.9) 

1.3) 
2.0) 

1.2) 
1.6) 


13. 0(  1.2) 
216. 6(  4.2) 


39.31  1.1) 

235. 1(  1.2) 

36. 4(  1.1) 

199. 6C  1.6) 

33. 4C  1.5) 

209. 7C  1.7) 

41. 3C  2.9) 

230. 1(  2.5) 


35.  IC 
217. 6( 


0.6) 
1.0) 


35. 3(  1.5) 
192. 3C  2.1) 


37. 7C 
196. 4C 


1.6) 
1.4) 


40. 2C  2.6) 
216. 3(  2.9) 


0.6 
1.4 
2.1 
1.3 


PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 


GRADUATED  H.S. 
POST  H.S. 
UNKNOWN


1126  166134( 5%)


3650  570097( 3%)


6634         1075734( 3%)


6272         1001015( 2%)


100. OC  0.0) 

200.21  1.2) 

O.Of  CO) 

HMMMlIf  0.0) 

O.OC  0.0) 

umiiiiif  0.0) 

O.Of  0.0) 

mmiiiif  0.0) 


O.Of 
nimiiiif 


0.0) 
0.0) 


100. Of  0.0) 

215. 5f  0.6) 

O.Of  0.0) 

mmiiiif  0.0) 

O.Of  0.0) 

mmiiiif  0.0) 


O.Of  0.0) 

iMimiif  0.0) 

O.Of  0.0) 

mmiiiif  0.0) 

100. Of  0.0) 

227. 4f  1.1) 

O.Of  0.0) 

mmKiif  0.0) 


O.Of  0.0) 

KmiKiif  0.0) 

O.Of  0.0) 

Kimiiiif  0,0) 

O.Of  0.0) 

mmiiiif  0.0) 

100. Of  0.0) 

211. 6f  0.6) 


0.0 
0.0 
0.0 
0.0 


AGE 


9  YEARS  OLD 
10  OR  OLDER 


11420         2016507f  IX) 


6133  77a997f  IX) 


5. Of  0.2) 

205. If  1.1) 

6.3(  0.6) 

192. 6<  2.0) 


19.6f 
219. 4f 


0.6) 
0.9) 


22. 2f  0.6) 
206. 6f  1.&) 



40. Of  1.1) 

231. 2f  1.2) 

33. 5f  0.9) 

215. 2f  1.4) 




35. 4f  0.6) 

215. 3f  0.9) 

^6. Of  0.9) 

201  9f  1,1) 


0.7 
1.2 


Table  15(28) 


WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

PERCENT AT OR ABOVE ANCHOR POINTS


—  TOTAL  — 

SEX 

MALE

FEMALE

ETHNICITY/RACE
WHITE
BLACK
HISPANIC
OTHER

PARENTAL EDUCATION
NOT GRADUATED H.S.
GRADUATED H.S.
POST H.S.
UNKNOWN

AGE

9 YEARS OLD
10 OR OLDER


N 

17840


9063
8777


11782
2796
2469
793


1126
3650
6634
6272


11507
6203


WEIGHTED N
2837712( 0%)


1412664( 1%)
1425047( 1%)


2002444( 1%)

431187( 1%)

322624( 2%)

81456( 4%)


166134( 5%)

570097( 3%)

1075734( 3%)

1001015( 2%)


2031707( 0%)
788337( 1%)


150 
96. 2(  0.2) 


9S.3C  0.3) 
97. IC  9.3) 


96. IC  0.2) 

90. 3(  1.1) 

92. 1(  0.7) 

97. 4(  0.5) 


92. 5C  0.9) 

96. 1(  0.3) 

97. 6(  0.3) 

95. 5(  0.4) 


97. 4( 
93. 1( 


0.2) 
0.5) 


200 

66. 3C  0.7) 


65. 1(  0.9) 
71. 5C  0.7) 


75. ec  0.6) 

44. 7(  1.5) 

52. 2(  1.4) 

73. 9C  1.9) 


52. 0(  1.4) 

66. 1(  1.0) 

76. 9(  0.9) 

62. 4(  1.0) 


72. 9(  0.6) 
56«2(  0.6) 


250 
19. 6C  0.7) 


)6.9C 
20. 7( 


0.6) 
0.7) 


24. 6(  0.9) 

6.0(  0.6) 

6.2(  0.6) 

20. 3(  2.4) 


7.0(  0.6) 

16. 3(  0.6) 

26. 6(  1.2) 

14. 7(  0.6) 


22. 4( 
12. 6( 


0.9) 
0.5) 


300 

1.2C  0.1) 


l.K  0.1) 
1.2(  0.2) 


1.5(  0.2) 

0.0(  0.0) 

O.K  0.1) 

2.0(  0.7) 


0.3(  0.2) 

0.7(  0.2) 

2.3(  0.2) 

0.3(  0.1) 


1.2(  0.1) 
0.6(  0.2) 


350 
0.0( 


0.0( 
0.0( 


O.tl 


O.OC  0.01 

0.0(  O.tl 

O.OC  O.tl 

O.OC  O.OI 


O.Ot 

O.OC  O.OI 

O.OC  0.01 

0.01  •••I 


O.OC  O.Ot 
O.OC  O.OI 




Table 15(29)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 8TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

IMPUTED STUDENT GRADE

                            N     WEIGHTED N       GRADE 8   %-OMIT

TOTAL                   18173   2761459( 0%)   100.0( 0.0)      0.0
                                               260.7( 0.5)

SEX
  MALE                   9066   1380877( 1%)   100.0( 0.0)      0.0
                                               257.0( 0.6)
  FEMALE                 9106   1380477( 1%)   100.0( 0.0)      0.0
                                               264.5( 0.6)

ETHNICITY/RACE
  WHITE                 12939   2044478( 0%)   100.0( 0.0)      0.0
                                               266.7( 0.6)
  BLACK                  2555    398198( 1%)   100.0( 0.0)      0.0
                                               240.7( 1.1)
  HISPANIC               2043    241189( 2%)   100.0( 0.0)      0.0
                                               242.4( 1.3)
  OTHER                   636     77593( 3%)   100.0( 0.0)      0.0
                                               263.6( 1.6)

PARENTAL EDUCATION
  NOT GRADUATED H.S.     1833    262530( 5%)   100.0( 0.0)      0.0
                                               244.2( 0.7)
  GRADUATED H.S.         6444    974445( 3%)   100.0( 0.0)      0.0
                                               255.5( 0.7)
  POST H.S.              8117   1261623( 2%)   100.0( 0.0)      0.0
                                               271.6( 0.7)
  UNKNOWN                1609    234214( 5%)   100.0( 0.0)      0.0
                                               241.6( 1.1)

AGE
  13 YEARS OLD          12043   1888098( 0%)   100.0( 0.0)      0.0
                                               266.5( 0.6)
  14 OR OLDER            5976    852015( 1%)   100.0( 0.0)      0.0
                                               247.7( 0.6)




Table 15(30)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 8TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

STUDENT SEX

                            N     WEIGHTED N          MALE       FEMALE   %-OMIT

TOTAL                   18172   2761355( 0%)    50.0( 0.5)   50.0( 0.5)      0.0
                                               257.0( 0.6)  264.5( 0.6)

SEX
  MALE                   9066   1380877( 1%)   100.0( 0.0)    0.0( 0.0)      0.0
                                               257.0( 0.6)  *****( 0.0)
  FEMALE                 9106   1380477( 1%)     0.0( 0.0)  100.0( 0.0)      0.0
                                               *****( 0.0)  264.5( 0.6)

ETHNICITY/RACE
  WHITE                 12939   2044478( 0%)    50.4( 0.6)   49.6( 0.6)      0.0
                                               263.0( 0.7)  270.4( 0.8)
  BLACK                  2554    398095( 1%)    48.4( 1.2)   51.6( 1.2)      0.0
                                               236.3( 1.3)  244.8( 1.4)
  HISPANIC               2043    241189( 2%)    49.3( 1.3)   50.7( 1.3)      0.0
                                               237.6( 2.0)  247.0( 1.6)
  OTHER                   636     77593( 3%)    51.1( 2.2)   48.9( 2.2)      0.0
                                               259.3( 3.1)  268.1( 2.7)

PARENTAL EDUCATION
  NOT GRADUATED H.S.     1832    262426( 5%)    44.0( 1.7)   56.0( 1.7)      0.0
                                               240.3( 1.3)  247.3( 1.1)
  GRADUATED H.S.         6444    974445( 3%)    49.3( 0.8)   50.7( 0.8)      0.0
                                               251.0( 0.8)  260.0( 0.8)
  POST H.S.              8117   1261623( 2%)    50.7( 0.6)   49.3( 0.6)      0.0
                                               267.8( 0.9)  275.9( 0.8)
  UNKNOWN                1609    234214( 5%)    55.6( 1.3)   44.4( 1.3)      0.0
                                               241.2( 1.6)  242.2( 1.2)

AGE
  13 YEARS OLD          12042   1887994( 0%)    47.0( 0.6)   53.0( 0.6)      0.0
                                               263.3( 0.7)  269.3( 0.7)
  14 OR OLDER            5976    852015( 1%)    57.0( 0.7)   43.0( 0.7)      0.0
                                               245.2( 1.0)  251.0( 1.0)




Table 15(31)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 8TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

ETHNICITY/RACE

                            N     WEIGHTED N         WHITE        BLACK     HISPANIC     AMER IND        ASIAN      UNCLASS   %-OMIT

TOTAL                   18173   2761459( 0%)    74.0( 0.2)   14.4( 0.1)    8.7( 0.1)    1.1( 0.1)    1.6( 0.2)    0.0( 0.0)      0.0
                                               266.7( 0.6)  240.7( 1.1)  242.4( 1.3)  256.2( 2.3)  268.9( 2.3)  257.2(14.9)

SEX
  MALE                   9066   1380877( 1%)    74.6( 0.4)   13.9( 0.4)    8.6( 0.2)    1.3( 0.1)    1.6( 0.1)    0.0( 0.0)      0.0
                                               263.0( 0.7)  236.3( 1.3)  237.6( 2.0)  251.3( 3.7)  265.9( 3.5)  185.2(****)
  FEMALE                 9106   1380477( 1%)    73.5( 0.3)   14.9( 0.3)    8.9( 0.3)    1.0( 0.2)    1.7( 0.2)    0.1( 0.0)      0.0
                                               270.4( 0.8)  244.8( 1.4)  247.0( 1.6)  262.3( 2.9)  271.6( 4.0)  264.9(13.1)

ETHNICITY/RACE
  WHITE                 12939   2044478( 0%)   100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                               266.7( 0.6)  *****( 0.0)  *****( 0.0)  *****( 0.0)  *****( 0.0)  *****( 0.0)
  BLACK                  2555    398198( 1%)     0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                               *****( 0.0)  240.7( 1.1)  *****( 0.0)  *****( 0.0)  *****( 0.0)  *****( 0.0)
  HISPANIC               2043    241189( 2%)     0.0( 0.0)    0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                               *****( 0.0)  *****( 0.0)  242.4( 1.3)  *****( 0.0)  *****( 0.0)  *****( 0.0)
  OTHER                   636     77593( 3%)     0.0( 0.0)    0.0( 0.0)    0.0( 0.0)   40.3( 5.0)   58.4( 5.1)    1.3( 0.5)      0.0
                                               *****( 0.0)  *****( 0.0)  *****( 0.0)  256.2( 2.3)  268.9( 2.3)  257.2(14.9)

PARENTAL EDUCATION
  NOT GRADUATED H.S.     1833    262530( 5%)    58.2( 2.2)   18.6( 1.3)   20.8( 1.9)    1.6( 0.4)    0.8( 0.2)    0.0( 0.0)      0.0
                                               250.9( 1.1)  231.1( 2.1)  237.0( 1.7)  251.6( 6.3)  239.2( 7.9)  185.2(****)
  GRADUATED H.S.         6444    974445( 3%)    76.8( 0.7)   14.4( 0.6)    6.8( 0.6)    1.1( 0.1)    0.9( 0.2)    0.0( 0.0)      0.0
                                               259.7( 0.8)  239.9( 1.5)  242.6( 2.1)  248.0( 3.6)  257.4( 6.3)  235.9( 9.4)
  POST H.S.              8117   1261623( 2%)    79.7( 0.6)   12.2( 0.4)    5.1( 0.4)    1.1( 0.2)    2.0( 0.3)    0.0( 0.0)      0.0
                                               276.1( 0.8)  249.2( 1.9)  256.6( 1.8)  266.3( 3.9)  278.1( 3.2)  290.5( 9.5)
  UNKNOWN                1609    234214( 5%)    49.3( 2.1)   22.5( 1.6)   23.0( 1.9)    1.3( 0.3)    3.9( 0.6)    0.0( 0.0)      0.0
                                               250.8( 1.3)  228.6( 1.8)  231.5( ?.7)  245.7( 9.0)  259.4( 3.5)  *****( 0.0)

AGE
  13 YEARS OLD          12043   1888098( 0%)    77.5( 0.2)   12.3( 0.1)    7.0( 0.1)    1.2( 0.1)    1.8( 0.2)    0.0( 0.0)      0.0
                                               270.9( 0.6)  247.7( 1.4)  249.3( 1.7)  261.8( 3.0)  273.7( 2.9)  264.9(13.1)
  14 OR OLDER            5976    852015( 1%)    66.4( 0.4)   18.9( 0.4)   12.5( 0.4)    1.0( 0.1)    1.1( 0.1)    0.0( 0.0)      0.0
                                               255.5( 0.9)  229.7( 1.5)  233.8( 1.9)  242.1( 3.7)  249.4( 3.6)  185.2(****)



Table  15(32) 

WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

REGION 


—  TOTAL 


N  WEIGHTED N

18173         2761459( 0%)


NE 

22. 7(  0.^) 
262. 6(  1.0) 


SE 

23. 5C  1.5) 
259.91  1.^) 


CENTRAL 

26. 3(  1.3) 
262. OC  1.2) 


WEST

27. 5(  0.6) 
256. 5(  0.7) 


%-OMIT
0.0 


SEX 
MALE


FEMALE


9066         1380877( 1%)


9106         1380477( 1%)


23. 5(  0.6) 

259. 1(  1.1) 

21. 9(  0.6) 

266. 7(  1.^) 


23. 3C  1.^) 

256. ^(  1.^) 

23. 6(  1,7) 

263. ^(  1.7) 


26.01  1.^) 

257. 5(  1.3) 

26. 6(  1.^) 

266. ^(  1.^) 


27.2(  0.9) 

255. 1(  1.0) 

27. 6(  0.6) 

261. 6(  0.6) 


0.0 


0.0 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


12939         2044478( 0%)


2555  398198( 1%)


2043  241189( 2%)


636


77593( 3%)


2^.0(  0.1) 

267. 0(  1.0) 

21. 9(  0.5) 

2V«.3(  3.9) 

1^.6C  3.6) 

2^6. 0(  2.0) 

19. 3(  ^.9) 

265. 3(  ^.P) 


21.51  1.7) 
266. 9(  1.6) 

^3.2(  0.^) 
239. 1(  1.3) 

10. 6(  ^.1) 
2^5. 2(  9.2) 

12. 9(  2.9) 
?61.6(  5.6) 


30. IC  1.6) 
265. 1.3) 

19. 2(  3.2) 

?^0.2C  1.9) 

6.5(  2.3) 

2^1. 6(  2.9) 

19. 6(  3.6) 

261. 6(  3.2) 


2^.^(  0.1) 
265. 9(  0.6) 

15. 7(  3.3) 
2^'*.7(  2.3) 

66. 1(  0.6) 
2^0. 6(  1.6) 

^6.2(  7.0) 
26^. 2(  2.4) 


0.0 


0.0 


0.0 


U.O 


PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 


GRADUATED  H.S. 


POST  H.S. 
UNKNOWN


1833  262530( 5%)


6444  974445( 3%)


8117         1261623( 2%)


1609  234214( 5%)


16. 9(  2.3) 

248.01  2.3) 

23. 9(  1.1) 

256. 4(  1.1) 

23. OC  0.9) 

272. 3(  1.2) 

22. 5(  2.7) 

245. 6C  1.3) 


31. 0(  2.2) 
242. 9C  1,0) 

21. 9(  1.7) 
251. 6C  1.3) 

24. 2C  2.2) 
273, 3C  2.0) 

20. OC  2.2) 
240, IC  2.5) 


21. OC  2.6) 

244. 6C  1.6) 

30. 7C  2.0) 

257. 6C  1.4) 

25. 2C  2.0) 

272. OC  0.9) 

22. OC  2.3) 

244. 6C  2.5) 


29. IC  3.0) 

242. 9C  1.1) 

23. 5C  1.5) 

253. 5C  1,4) 

27. 6C  1.2) 

269. 6C  1,1) 

35. 5C  2.6) 

236. OC  1.9) 


0.0 


0.0 


0*0 


0.0 


AGE 

13  YEARS  OLD 


14  OR  OLDER 


12043         1888098( 0%)


5976  852015( 1%)


22. 6C  0.3) 

269. IC  1.0) 

21. 6C  0.7) 

246. 2(  2.0) 


23. IC  1.7) 

267. 8C  1.5) 

24. 4C  1.4) 

243. 2C  1.9) 


26. 6C  1.4) 

265. 6C  1.3) 

25. 5C  1.5) 

253. OC  1.7) 


27. 3r  0.5) 

264. IC  0.6) 

26. 3C  1.1) 

246. 4C  1.4) 


0.0 


0.0 




Table  15(33) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 8TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

IMPUTED STUDENT AGE


—  TOTAL  — 


N 

18173 


WEIGHTED N
2761459( 0%)


11-LESS 

0.0(  C.OI 
249.8(15.9) 


12 

0.8( 
269.01 


0.1) 
4.3) 


13 

68. 4C  0.2) 
266. 5(  0.6) 


14 

26. 1( 
250. 7( 


0.2) 
0.7) 


15 

4.0C 
231. 7( 


0.2) 
1.8) 


16-MORE


0.8( 
227. 9C 


0.1) 
3.0) 


%-OMIT
0.0 


SEX 
MALE


FEMALE


9066
9106


1380877( 1%)
1380477( 1%)


0.0( 


G.O) 
0.0) 


0.0(  0.0) 
249.8(15.9) 


0.6(  0.1) 

259. 3(  5.3) 

0.9(  0.2) 

275. 5(  4.7) 


64. 2(  0.4) 

263. 3(  0.7) 

72. 5C  0.4) 

269. 3(  0.7) 


29. 3(  0.4) 

248. IC  1.1) 

22. 9(  0.4) 

254. 1(  1.1) 


5.0C  0.3) 
232. 7(  2.1) 


3.1( 
230. 2( 


0.2) 
3.0) 


0.9(  0.2) 
222. 6(  4.5) 


0.6( 
236. 2C 


0.1) 
5.1) 


0.0 


0.0 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


12939


2555


2043


636


2044478( 0%)

398198( 1%)

241189( 2%)

77593( 3%)


0.0(  0.0) 
249.8C15.9) 


O.OC 
mmiiiiC 

O.OC 
0.0( 


0.0) 
0.0) 

0.0) 
0.0) 

0.0) 
0.0) 


0.7(  0.1) 

271. 9(  5.3) 

1.2(  0.3) 

265. 3(  9.1) 

0.5(  0.2) 
239.3(14.1) 

1.3(  0.4) 

285. 6(  9.1) 


71. 6(  0.2) 

270. 9(  0.6) 

5C.5(  0.6) 

247. 7(  1.4) 

55. 2{  0.9) 

249. 3(  1.7) 

74. 9(  1.1) 

269. OC  2.2) 


24. 7( 
257. 3( 


0.3) 
0.9) 


28. 7(  0.8) 

232. 4(  1.7) 

35. 1(  1.1) 

236. 8(  1.3) 

19. 8(  1.2) 

247. 8(  2.5) 


2.6( 
240. 9( 


0.2) 
2.3) 


9.0(  0.7) 
222. 5(  2.5) 


7.8( 
222. 8( 


1.1) 
3.8) 


3.5(  0.8) 
235.11  5.3) 


0.3(  0.1) 

233. 7(  4.2) 

2.6(  0.5) 

226. 1(  4.2) 

1.3(  0.6) 

220. 7(  8.0) 

0.5(  0.3) 
232.0(33.7) 


0.0 


0.0 


0.0 


0.0 


PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 


GRADUATED H.S.
POST H.S.
UNKNOWN


1833
6444
8117
1609


262530( 5%)

974445( 3%)

1261623( 2%)

234214( 5%)


O.OC 
««i«i«i«i( 


0.0) 
0.0) 


0.0(  0.0) 
249.8(15.9) 


0.0( 
«<i<i»<i( 

0.0( 

llll«IM«l( 


0.0) 
0.0) 

0.0) 
0.0) 


0.6(  0.2) 
253.4(17.8) 

0.7(  0.2) 

259. 8(  7.1) 

0.8(  0.2) 

278. 7(  4.8) 

0.8(  0.2) 

260. 0(  8.2) 


52. 9(  1.2) 

252. 0(  0.8) 

68. 0(  0.7) 

260. 4(  0.7) 

74. 2(  0.5) 

275. 2(  0.7) 

56. 2(  1.7) 

250. 0(  1.3) 


34. 3(  1.1) 

238. 5(  1.7) 

26. 6(  0.7) 

246. 6(  1.2) 

22. 7(  0.5) 
1.2) 

32. 6(  1.3) 

232. 9(  1.7) 


10. 0( 
226. 4( 

4.0( 
235. 6( 

1.9( 
240. 6( 

8.6( 
220.  7( 


1.0) 
2.5) 

0.4) 
3.1) 

0.1) 
3.0) 

1.0) 
3.9) 


2.2(  0.6) 

224. 5(  5.3) 

0.7(  0.2) 

229. 1(  5.1) 

0.3(  0.1) 

234. 3(  5.7) 

1.7(  0.4) 

230. 0(  5.8) 


0.0 


0.0 


0.0 


0.0 


AGE 

13  YEARS  OLD 


14  OR  OLDER 



12043 


5976 


1888098( 0%)
852015( 1%)


0.0( 
iiiiiiiiii( 

0.0( 
llllllllll( 


0.0) 
0.0) 

0.0) 
0.0) 


0.0(  0.0) 
«iiiiiiiii(  0.0) 


0.0( 
«llllllll( 


0.0) 
0.0) 


100. 0(  0.0) 

266. 5(  0.6) 

0.0(  0.0) 

«««««(  0.0) 




0.0(  0.0) 

iiiiiiiiii(  0.0) 

84. 5(  0.9) 

250. 7(  0.7) 


0.0( 
iiiiiiiiii( 


0.0) 
0.0) 


13. 1(  0.6) 
231. 7(  1.8) 


0.0( 

K«lllll( 


0.0) 
0.0) 


2.5(  0.4) 
227. 9(  3.0) 


0.0 


0.0 




Table  15(34) 

SIZE/TYPE OF COMMUNITY


TOTAL  — 


N  WEIGHTED N

18173         2761459( 0%)


RURAL 


DIS URB         ADV URB         BIG CITY


FRINGE


MEDIUM


SMALL


SEX 
MALE


FEMALE


9066 


ETHNICITY/RACE 
WHITE


BLACK

HISPANIC

OTHER


PARENTAL EDUCATION
NOT GRADUATED H.S.


GRADUATED H.S.


POST H.S.
UNKNOWN


AGE 

13  YEARS  OLD 


14  OR  OLDER 


12939 
2555 
2043 
636 

1833
6444
8117
1609

12043 
5976 


1380877( 1%)


9106         1380477( 1%)


2044478( 0%)

398198( 1%)

241189( 2%)

77593( 3%)

262530( 5%)
974445( 3%)
1261623( 2%)
234214( 5%)

1888098( 0%)
852015( 1%)


5.1C  1.1)  9.2C  1.6)  10. 7C  2.3)  10. 3C  3.4)  16. 2C 
265. IC  2.1)  245. 9C  2.2)  260. 2C  3.2)  261. 5C  2.0)  265. 9C 


3.0)  15»3t  2»6)  3>.3t  2.3) 
1.6)  263. 3(  2.6)  265. 2C  1.0) 


.4:5!  LI!     L'! IV,  .Ail  IV,     l.V,  .S:?! 

Ill  e^;;;!  .IZ IV,  tV,  ,4:1!  IV,  ,l^Z  IV,  .^Z  IV, 
«5:;!  ',V,     IV,  u\Z  \V,     IV,  d',Z  \V        .ItZ  IV, 


2.7t  0.9)  17.lt  3.7)  13. 6C 
245. Ot  5.6)  247. 6t  4.9)  263. 4C 


3.4)  13. 6C  4.4)  20. 5C  4.4)  11. 7C  3.0)  20. 6C  3.6) 
2.4)  265. 9C  6.6)  271. 5(  4.0)  262. 6C  5.6)  257. IC  3.2) 


6.3t  1.6)  12. 6(  2.6)  i.6(  0.5)  7.7(  2.7)  12.7( 
246. 2(  2.6)  235. 5(  2.7)  251. 9(  6.2)  240. 2C  3.0)  244.2( 

•J*2I  ^'^^      ^'^^  ^'^^  10. 2(  3.21    16. 2C 

254. 7t  2.1)  241. 5(  2.6)  265. 9C  2.1)  254.6t  2.0)  256.5C 

4. It  1.0)  6.5(  1.1)  16. 6C  3.4)  11. 2C  4.0)  17. 5( 
274.6C  2.6)  251. 2t  2.4)  261 .6C  2.9)  266.7C  1.6)  272.0( 

4.6t  1.3)  19. 2(  4.4)  5.7t  1.1)  10. 7C  3.3)  18  3C 
244. 7t  5.6)  230. 9(  3.3)  259. 2C  3.4)  244. 4C  2.1)  245. OC 


3.6)  17. 9(  5.2)  41. 0(  2.9) 
2.1)  242. 9C  2.7)  247. 7C  1.2) 

3.1)  13.6C  2.7)  39. 7C  2.6) 
1.0)  256. OC  1.3)  257. IC  1.2) 

3.0)  15. 6C  2.5)  26. 5C  2.5) 
1.3)  271. 6C  1.5)  272. 2C  0.9) 

3.6)    17. OC  4.5)    24. 5C  2.4) 

3.2)  237. 3C  4.9)  244. 6C  2.3) 


5.1C  1.0)  7.9C  1.4)  12. OC  2.6)  10. 7C  3.6)  16  7C 
265. OC  1.7)  246.1C  2.7)  280. OC  2.6)  263. 2C  1.6)  266. 7C 


3.1)  14. 4C  2.4)  33. IC  2.2) 
1.4)  267. 3C  2.0)  266. 9C  0.9) 


5.7C 
249. 7C 


1.3)  10. 6C  2.0)  7.3C  1.7)  9.7C  3.1)  16. OC  3.1)  17. OC  3  4)  33  51  9  ci 
3.7)  230.2C  1.6)  265.5C  3.4)  247.0C  2.1)  251.1C  2.1)  24;:5C  3.1)  24".1c  M! 


%-OMIT
0.0 

0.0 
0.0 

0.0 
0.0 
0.0 
0.0 

0.0 
0.0 
0.0 
0.0 

0.0 
0.0 




Table  15(35) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 8TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

PARENTAL EDUCATION

                           N     WEIGHTED N       NOT HS      GRAD HS      POST HS      UNKNOWN  %-OMIT

— TOTAL —              16003   2732612( 1%)    9.6( 9.5)   35.7( 1.0)   46.2( 1.1)    6.6( 0.4)     1.0
                                             244.2( 0.7)  255.5( 0.7)  271.6( 0.7)  241.6( 1.1)

SEX
  MALE                  6975   1366061( 2%)    6.4( 0.6)   35.2( 1.0)   46.6( 1.1)    9.5( 0.5)     1.1
                                             240.3( 1.3)  251.0( 0.6)  267.6( 0.9)  241.2( 1.6)
  FEMALE                9027   1366647( ?%)   10.6( 0.5)   36.1( 1.1)   45.5( 1.3)    7.6( 0.4)     1.0
                                             247.3( 1.1)  260.0( 0.6)  275.9( 0.6)  242.2( 1.2)

ETHNICITY/RACE
  WHITE                12612   2021763( 1%)    7.6( 0.5)   37.0( 1.1)   49.7( 1.4)    5.7( 0.3)     1.1
                                             250.9( 1.1)  259.7( 0.6)  276.1( 0.6)  ?50.6( 1.3)
  BLACK                 2537    395346( 1%)   12.3( 0.9)   35.5( 1.5)   36.6( 1.3)   13.3( 1.1)     0.7
                                             231.1( 2.1)  239.9( 1.5)  249.2( 1.9)  226.6( 1.6)
  HISPANIC              2026    239352( 2%)   22.6( 2.1)   27.9( 2.7)   26.6( 2.1)   22.5( 2.2)     0.8
                                             237.0( 1.7)  242.6( 2.1)  256.6( 1.6)  231.5( 1.7)
  OTHER                  626     76331( 3%)    6.2( 1.5)   25.4( 2.4)   50.4( 2.6)   15.9( 1.6)     1.6
                                             246.5( 5.0)  252.0( 4.3)  274.1( 2.2)  ?56.1( 3.2)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1633    262530( 5%)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)     0.0
                                             244.2( 0.7)  *****( 0.0)  *****( 0.0)  *****( 0.0)
  GRADUATED H.S.        6444    974445( 3%)    0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)     0.0
                                             *****( 0.0)  255.5( 0.7)  *****( 0.0)  *****( 0.0)
  POST H.S.             6117   1261623( 2%)    0.0( 0.0)    0.0( 0.0)  100.0( 0.0)    0.0( 0.0)     0.0
                                             *****( 0.0)  *****( 0.0)  271.6( 0.7)  *****( 0.0)
  UNKNOWN               1609    234214( 5%)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)  100.0( 0.0)     0.0
                                             *****( 0.0)  *****( 0.0)  *****( 0.0)  241.6( 1.1)

AGE
  13 YEARS OLD         11932   1666600( 1%)    7.4( 0.4)   35.4( 1.1)   50.1( 1.2)    7.0( 0.3)     1.0
                                             252.0( 0.6)  260.4( 0.7)  275.2( 0.7)  250.0( 1.3)
  14 OR OLDER           5916    643042( 1%)   14.5( 0.6)   36.2( 1.2)   37.4( 1.2)   11.9( 0.6)     1.1
                                             235.2( 1.2)  244.6( 1.1)  261.3( 1.2)  230.4( 1.7)




Table  15(36) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 8TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

PERCENT AT OR ABOVE ANCHOR POINTS

                           N     WEIGHTED N          150          200          250          300          350

— TOTAL —              18173   2761459( 0%)   99.6( 0.0)   95.5( 0.2)   63.1( 0.7)   12.5( 0.4)    0.3( 0.0)

SEX
  MALE                  9066   1360677( 1%)   99.6( 0.1)   94.0( 0.2)   59.1( 0.6)   10.5( 0.5)    0.1( 0.0)
  FEMALE                9106   1360477( 1%)   99.?( 0.0)   96.9( 0.3)   67.1( 0.6)   14.4( 0.4)    0.5( 0.1)

ETHNICITY/RACE
  WHITE                12939   2044476( 0%)   99.9( 0.0)   97.4( 0.2)   69.9( 0.6)   15.3( 0.5)    0.4( 0.1)
  BLACK                 2555    396196( 1%)   99.4( 0.2)   69.6( 0.6)   40.0( 1.5)    2.7( 0.4)    0.0( 0.0)
  HISPANIC              2043    241169( 2%)   99.6( 0.1)   86.6( 1.2)   42.0( 1.7)    3.9( 0.5)    0.1( 0.1)
  OTHER                  636     77593( 3%)  100.0( 0.0)   95.5( 0.7)   66.6( 2.3)   13.9( 1.9)    0.2( 0.2)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1633    262530( 5%)   99.7( 0.2)   91.6( 0.5)   43.6( 1.1)    4.0( 0.4)    0.0( 0.0)
  GRADUATED H.S.        6444    974445( 3%)   99.9( 0.0)   95.1( 0.4)   57.5( 1.0)    6.0( 0.4)    0.1( 0.1)
  POST H.S.             6117   1261623( 2%)   99.9( 0.0)   96.0( 0.1)   75.4( 0.6)   19.3( 0.7)    0.6( 0.1)
  UNKNOWN               1609    234214( 5%)   99.2( 0.2)   86.3( 0.6)   41.2( 1.4)    4.3( 0.5)    0.1( 0.1)

AGE
  13 YEARS OLD         12043   1866096( 0%)  100.0( 0.0)   97.4( 0.2)   69.9( 0.6)   14.8( 0.5)    0.5( 0.1)
  14 OR OLDER           5976    652015( 1%)   99.5( 0.1)   91.1( 0.4)   47.7( 1.0)    7.1( 0.5)    0.0( 0.0)




Table  15(37) 


NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

IMPUTED STUDENT GRADE

                           N     WEIGHTED N     GRADE 11  %-OMIT

— TOTAL —              19080   2563822( 0%)  100.0( 0.0)     0.0
                                             289.3( 0.6)

SEX
  MALE                  9443   1292364( 2%)  100.0( 0.0)     0.0
                                             284.5( 1.0)
  FEMALE                9637   1271457( 2%)  100.0( 0.0)     0.0
                                             294.3( 0.9)

ETHNICITY/RACE
  WHITE                13914   1906547( 0%)  100.0( 0.0)     0.0
                                             295.8( 0.9)
  BLACK                 2792    383493( 1%)  100.0( 0.0)     0.0
                                             268.1( 1.8)
  HISPANIC              1699    203453( 2%)  100.0( 0.0)     0.0
                                             269.5( 2.0)
  OTHER                  675     70329( 3%)  100.0( 0.0)     0.0
                                             287.6( 2.2)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    2300    293458( 5%)  100.0( 0.0)     0.0
                                             269.5( 1.2)
  GRADUATED H.S.        6600    865215( 3%)  100.0( 0.0)     0.0
                                             281.6( 0.7)
  POST H.S.             9378   1301603( 3%)  100.0( 0.0)     0.0
                                             300.6( 0.9)
  UNKNOWN                596     78893( 5%)  100.0( 0.0)     0.0
                                             259.2( 2.1)

AGE
  16 OR YOUNGER         1992    334011( 6%)  100.0( 0.0)     0.0
                                             299.6( 1.4)
  17 YEARS OLD         14009   1699683( 0%)  100.0( 0.0)     0.0
                                             295.0( 0.7)
  18 OR OLDER           3079    530128( 3%)  100.0( 0.0)     0.0
                                             264.5( 1.3)




Table  15(38) 


NAEP READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS

WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

STUDENT  SEX 


—  TOTAL  — 


SEX
MALE


FEMALE


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


PARENTAL EDUCATION
NOT GRADUATED H.S.


GRADUATED H.S.
POST H.S.


AGE

16 OR YOUNGER


17 YEARS OLD

18 OR OLDER


N 

WEIGHTED  N 

MALE 

FEMALE 

%-OMIT

19080 

2563822C 

OX) 

50. 4C 
284. 5C 

0.8) 
1.0 ) 

49. 6C 
294. 3r 

0.8) 
0.9) 

0.0 

1292364C 

Z/.} 

100. OC 
284. 5C 

0.0) 
1.0) 

O.OC 

0.0) 
0.0) 

0.0 

1271^57C 

zy.} 

o.oc 

0.0) 
0.0) 

100. OC 
294. 3C 

0.0) 
0.9 ) 

0.0 

19065^7C 

O/i) 

50. 3C 
290. 5C 

0.8) 
1.0) 

40  7f 
301. IC 

n  ft  1 
0.9) 

0.0 

2792 

383^9 3( 

IX) 

49. 2C 
264. 3C 

1.3) 
2.2) 

50. 8C 

1.3) 

0.0 

1699 

2034531 

2X) 

51. 3C 
266. 2C 

1.7) 
2.2) 

48. 7( 
273. 0( 

1.7) 

9  ft  1 

0.0 

675 

70329( 

3X) 

56. 9( 
282. 6( 

2.9) 
2.8) 

43.  IC 
294. 3( 

2.9) 
3.3) 

0.0 

2300 

293''«58( 

45. 6( 
264. 5( 

0.9) 
1.5) 

54. 4C 
273. 7i 

0.9) 
1.6) 

0.0 

6600 

865215( 

50. 0( 
275. 9( 

0.8) 
1.0) 

50. OC 
287. 7C 

0.8) 
n  Ai 

0.0 

9378 

1301603( 

3X) 

51. 2C 
295. 9( 

1.2) 
I.l) 

48. 8( 
305 .6( 

1.2) 
1  n  1 

O.D 

596 

7e693( 

s/.^ 

56. 5C 
258. 4C 

2.8) 
2.8) 

43. 5( 
260. 3( 

2.8) 
3.3) 

0.0 

1992 

334011( 

6<) 

43. 5C 
295. 2C 

1.7) 
1.9) 

56. 5C 
303. 1( 

1.7) 
1.6) 

0.0 

K009 

1699683( 

OX) 

48. 2( 
291. IC 

0.8) 
0.9) 

51. 8C 
298. 7C 

0.8) 
0.8) 

0.0 

3079 

530128C 

3X) 

61.9; 
263.31 

1.3) 
1.3) 

38. IC  1.3) 
266. 6C  1.7) 


0.0 



Table  15(39) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)


REGION


N

WEIGHTED N

NE

SE

CENTRAL

WEST

%-OMIT

—  TOTAL  — 

19060 

25636221 

24. 5( 
290. 9( 

0.4) 
2.7) 

22. 0( 
267. 3( 

1.9) 
1.7) 

27. 4( 
290. 7( 

1.6) 
1.7) 

26. 1( 
266. 2( 

0.6) 
0.6) 

0.0 

SEX 

9443 

1292364( 

25. 6( 
266. 3( 

0.6) 
2.4) 

21. 2C 
262. 7( 

1.6) 

^.6) 

26. 6( 
265. 6( 

1.6) 
1.7) 

26. 3( 
262. 6( 

1.0) 
1.5) 

O.t 

FEMALE

9637 

1271457( 

23. 4( 
295. 9( 

0.6) 
3.2) 

22. 6( 
291. 6( 

2.1) 
1.0) 

26. 0( 
295. 5( 

2.1) 
2.0) 

25. 6( 
293. 7( 

1.1) 
1.3) 

0.0 

ETHNICITY/RACE
WHITE

13914 

1906547( 

OX) 

25.91 
296. 0( 

0.1) 
2.6) 

19.51 
297. 6( 

2.1) 
1.7) 

31. 3( 
294. 7( 

2.1) 
1.3) 

23.21 
295. 2( 

0.1) 
1.3) 

O.t 

BLACK

2792 

3634931 

1X1 

24. 5( 
272. 6( 

0.5) 
5.7) 

42. 4( 
264. 7( 

0.5) 
1.5) 

17. 6( 
266. 1( 

3.5) 
3.0) 

15. 3C 
272. 6( 

3.7) 
3.9) 

0.0 

203453( 

2X) 

13.91 
264. 0( 

5.2) 
2.2) 

11. 2(  7.4) 
273.5(14.5) 

10. 4( 
265. 0( 

5.2) 
5.6) 

64. 5( 
270. 7( 

1.0) 
2.2) 

OTHER

70329( 

3X) 

16. 6( 
264. 5( 

2.5) 
5.7) 

9.2( 
294. 1( 

2.3) 
3.9) 

22. 1( 
260. 5( 

3.7) 
6.3) 

50. 1( 
290. 6( 

4.1) 
2.4) 

0.0 

PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 

2300 

29345d( 

S/.\ 

21. 6( 
272. 1( 

2.7) 
2.9) 

26. 1( 
265. 5( 

2.4) 
2.4) 

22. 3( 
270. 9( 

3.0) 
2.6) 

27. 7( 
270. 3( 

2.6) 
2.1) 

0.0 

6600 

6652151 

25. 0( 
262. 5( 

1.9) 
1.4) 

22. 4( 
276. 9( 

1.9) 
1.6) 

32. 2( 
264. 6( 

2.5) 
1.2) 

20. 5( 
279. 7( 

1.3) 
1.4) 

o.t 

1301603( 

25. 4( 
301. 7( 

1.7) 
3.1) 

20. 1( 
301. 3( 

2.b) 
1.6) 

25. 5( 
301. 5( 

2.2) 
1.3) 

29. 0( 
296. 5( 

1.4) 
1.1) 

0.0 

UNKNOWN

596 

76693( 

5X1 

21. 5( 
256. 3( 

2.6) 
4.6) 

20. 4( 
256. 0( 

3.2) 
4.6) 

21. 6( 
i>^1.0( 

4.0) 
6.5) 

36. 2( 
259. 4( 

2.7) 
3.7) 

0.0 

AGE 

16  OR  YOUNGER 

1992 

334011( 

6X1 

33.51 
300. 7( 

2.7) 
2.6) 

26. 3( 
296. 9( 

3.7) 
2.5) 

17. 6( 
299. 4( 

3.4) 
3.3) 

22. 5( 
299.01 

2.1) 
2.6) 

0.0 

17  YEARS  OLD 

14009 

1699663( 

OX) 

23. 5( 
295. 5( 

0.3) 
2.2) 

20. 5( 
294. 5( 

1.6) 
1.6) 

29. 4( 
296. 2( 

1.6) 
1.3) 

26. 6( 
293. 6( 

0.5) 
0.6) 

0.0 

18 OR OLDER

3079

530120! 

3X) 

22.01 
265.6t 

1.6) 
4.41 

24.21 
259.61 

2.2) 
1.71 

27.11 
266.11 

3.2) 
2.71 

26.71 
264.51 

1.7) 
1.3) 

0.0 



Table  15(40) 


NAEP READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS

WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

ETHNICITY/RACE


N

WEIGHTED N

WHITE

BLACK

HISPANIC

AMER IND

ASIAN

UNCLASS

—  TOTAL  — 

19080 

25638221 

OX) 

74. 4( 
295. 8< 

0.2) 
0.9) 

15. Ot 
268.lt 

0.1) 
1.8) 

7.9t 
269. 5t 

0.2) 
2.0) 

0.8t 
288.lt 

0.1) 
3.8) 

1.9t 
287. 2t 

O.I) 
3.2) 

0  Of   0  0  1 
314.6tl8.1) 

SEX 

MALE

9443 

1292364C 

2X) 

74. 2( 
290. 5( 

0.4) 
1.0) 

14. 6t 
264. 3t 

0.4) 
2.2) 

8. It 
266. 2t 

0.3) 
2.2) 

l.Ot 

284. &( 

0.1) 
3.8) 

2.1C 
281. 4t 

0.1) 

O.ot  0.0) 
329.9t»»iHf ) 

FEMALE

9637 

1271457C 

2X) 

74. 5t 
301. 1( 

0.4) 
0.9) 

15. 3t 
271. 8t 

0.4) 
1.9) 

7.8t 
273. Ot 

0.2) 
2.8) 

0.6t 
294. 2t 

0.1) 
5.6) 

1.8( 

9  OA  1  f 

0.2) 

cot  0.0) 
30d.2t20.2) 

ETHNICITY/RACE

WHITE

13914 

19065471 

OX) 

100. Ot 
295. 8t 

0.0) 
0.9) 

O.ot 

)i)i)i)i)it 

0.0) 
0.0) 

O.ot 

)i)i)iiiiit 

0.0) 
0.0) 

O.ot 

0.0) 
0.0) 

O.ot 

IfUKKKt 

0.0) 
0.0) 

O.ot  0.0) 
nnimiit  0  0 ) 

BLACK

2792 

3d3493f 

IX) 

o.ot 

0.0) 
0.0 ) 

100. Ot 
268.lt 

0.0) 

1   A  1 

O.ot 

y%  ^ 

0.0) 
n  n  1 

n  n  f 
D.  Dt 

K^CIflflft 

0.0 ) 
0.0) 

O.ot 

HHKIfllt 

0.0) 
0.0 ) 

O.ot  0.0) 
ffimiiiit  0.0) 

HISPANIC 

1699

203453f 

2X) 

O.ot 

0.0) 
0.0) 

O.ot 

0.0) 
0.0 ) 

100. Ot 
269. 5t 

0.0) 
2.0) 

O.ot 

KKKKKt 

0.0) 
0.0) 

O.ot 

0.0) 
0.01 

O.ot  0.0) 
KKimiit  0.0) 

OTHER 

675 

70329f 

3X) 

O.ot 

)l)()()(llt 

0.0) 
0.0) 

O.ot 

KKKKKt 

0.0) 
0.0) 

O.ot 

0.0) 
0.0) 

28. 5t 
288.lt 

2.5) 
3.8) 

70.91 

9A7  9f 

CO / . CI 

2.5) 

T  9  1 
^  .  C  1 

0.7C  0.3) 
314.6tl8.1) 

PARENTAL  EDUCATION 

NOT  GRADUATED  H.S. 

2300 

2934581 

5X) 

52. ot 
277. 6t 

2.6) 
1.6) 

21. 4t 

259. Ot 

1.9) 
2.6) 

23. 9t 

261. 8t 

2.5) 
1.6) 

0.9t 
259. 4t 

0.2) 
9.9) 

1.8t 
268. 3t 

0.3) 
6.1) 

O.ot  0.0) 
ffimiiift  0.0) 

GRADUATED  H.S. 

6600 

8652151 

3X) 

75. 5t 
287.lt 

0.8) 
0.7) 

16. 6t 

262>4t 

0.6) 
2.0) 

6.0t 
268. 7t 

0.6) 
2.5) 

0.9t 
287. 5t 

0.1) 
5.7) 

l.OC 
273. 2t 

0.2) 
5.6) 

O.ot  0.0) 
266.0t«««ii) 

POST  H.S. 

9378 

13016031 

3X) 

80. 3t 
304. 6t 

0.7) 
1.0) 

11. 9t 
279. 7t 

0.6) 
1.9) 

4.7t 
286.lt 

0.6) 
2.0) 

o.-'C 

298. 5t 

0.1) 
2.8) 

2.3t 
299. 3t 

0.1) 
2.9) 

O.ot  0.0) 
325. 3t  16.1) 

UNKNOWN

596 

788931 

5X) 

42. 4t 
272. 2t 

3.0) 
3.1) 

25. 6t 
249. 8t 

2.4) 
4.3) 

24. 8t 
247. 2t 

2.7) 
2.2) 

0.7t  0.4) 
253.5tl9.3) 

6.6t 
258. 5t 

1.0) 
5.5) 

O.ot  0.0) 
0.0) 

%-OMIT
0.0 

OaI 
0.0 

0.0 
0.0 
0.0 
0.0 

0.0 
0*0 
0.0 
0.0 


AGE 

16 OR YOUNGER


17  YEARS  OLD 

18  OR  OLDER 



1992  334011t  6X) 

14009  1699683t  OX) 
3079  530128t  3X) 


74. 9t 
305.lt 


1.7) 
1.5) 


78. 7t  0.2) 
299. 5t  0.7) 


60. Ot 
272. 8t 


1.1) 
1.7) 


16. 6t 
279. 6t 


1.5) 
2.8) 


12.lt  0.2) 
274. 6t  1.8) 


23. Ot 
252.lt 


1.0) 
2.1) 


5.8t  1.2) 

283. 5t  3.4) 

6.6t  0.2) 

279. 4t  2.3) 

13. 5t  0.7) 

250. 2t  1.9) 


0.3t  0.1) 
287. 7t  11. 6) 


0.9f 
293. 6t 

0.8t 
269. 7t 


0.1) 
3.6) 

0.2) 
9.5) 


2.4t  0.3) 

308.lt  4.2) 

1.7t  0.1) 

295. 7t  2.8) 

2.6t  0.3) 

257.9t  5.5) 


O.Cf  0.0) 
329.9t»iMH() 

O.OC  0.0) 
308.2t20.2) 

O.ot  0.0) 
ffimiiift  0*0) 


0.0 


0.0 


0.0 




Table 15(41)


NAEP READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS

WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

IMPUTED STUDENT AGE


TOTAL  — 


N  NEIGHTED  N 

19080         2563822(  OX) 


15-LESS 


0.2( 
303.11 


0.0) 
7.7) 


16 

12. 8( 
299. 6( 


0.  8) 

1.  ^) 


17 

66. 3C 
295. 0( 


0.2) 
0.7) 


18 

17. 8(  0.6) 
266. 6(  1.^) 


19 

ZAi  0.2) 
253. 1(  2.0) 


20-MORE


0.5( 
24^. 6( 


0.1) 
6.2) 


%-OMIT
0.0 


SEX 
MALE


FEMALE 


9^43         12923641  2X) 


9637         1271457C  2X) 


O.lt  0.0) 
296.0(13.0) 

0.2(  0.0) 
306. 8(  8.9) 


ll.lt  0.8) 

295. Jli  1.9) 

14. 6(  0.9) 

303. 0(  1.6) 


63. 3( 
291. IC 


0.5) 
0.9) 


69. 3(  0.5) 
298.71  0.8) 


21. 8(  0.7) 

?65.4(  1.5) 

13. 7(  0.7) 

268. 6(  1.8) 


3,1(  0.3) 

251. 9(  2.1) 

1.7(  0.2) 

255. 4(  3.8) 


0.5( 
241. 7( 

0.5( 
248. 1( 


0.1) 
7.1) 

0.1) 
9.3) 


0.0 


0.0 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC 

OTHER 


13914         19065471  OX) 


2792  3834931  IX) 


1699  203453f  2X) 


675 


703291  3X) 


0.2t  0.0) 
306. 4t  6.8) 

0.3(  0.1) 
301.5(24.4) 

O.K  0.1> 
258.3(»»»») 

0.3(  0.0) 
288.9(33.0: 


13. 0(  0.8) 

305, 1»  t.«J) 

14. 2t  1.5) 

279,11  2.9) 


9.41 
283. 7( 

12, 5( 
306, 6( 


2.2) 
3.4) 

1.1) 
3.8) 


70. 2(  0.2) 

299. 5C  0.7) 

53. 7(  0.4) 

274. 6(  1.0) 

55. 2t  1.2) 

279, 4t  2,3) 

61. 4t  1.3) 

295, It  2,0) 


15. 3(  0.7) 

273. 5(  1.8) 

24. 7(  0.9) 

254. 2(  2.3) 

27. 4(  1.1) 

251. 2t  2.3) 

18. 1(  1.8) 

268. 2(  5.0) 


1.2(  0.1) 

264. 1(  4.1) 

5.8(  0.9) 

246. 7(  3.4) 

6.2(  0.8) 

246. 8(  3.1) 

5.6(  1.4) 

246. 6(  6.1 ) 


0.2( 
263. 3( 


0.0) 
8.6) 


1.3(  0.3) 
234. 0(  8.0) 

1.7(  0.4) 
245.8(12.1) 

2.1(  0.4) 
234.1^(12.1) 


0.0 
0.0 
0.0 
0,0 


PARENTAL  EDUCATION 
NOT GRADUATED H.S.


GRADUATED H.S.


POST H.S.
UNKNOWN


2300 


6600 


596 


293458(  5X) 


865215(  3X) 


9378         1301603(  3X) 


78893(  5X) 


O.K  0.1) 
262.4(15.3) 

0.2(  0.1) 
286.7(11.2) 


0.2( 
321. 5( 


0.1) 
9.4) 


0.0(  0.0) 

)()(ii}(}((  0.0) 


8.6( 
285. 0( 


1.2) 
3.1) 


10. 9(  0.9) 

291. 0(  1.8) 

15. 3(  0.8) 

306. 5(  1.7) 

9.8(  1.3) 

269. 2(  6.5) 


54. 4(  1.3) 
279. If  1.2) 


66. 9( 
287. 6( 

69. 4( 
303. 8( 


0.7) 
0.7) 

0.5) 
0.8) 


51. 2(  1.5) 
269. 2(  2.9) 


28. 6(  1.2) 

254. 1(  2.0) 

19. 4(  0.8) 

259. 9(  1.5) 

13. 7(  0.7) 

282. 0(  2.0) 


28. 0( 
244. 2( 


1.5) 
3.2) 


6.9(  0.8) 

245. 4(  3.3) 

2.2(  0.3) 

258. 8(  4.3) 

l.K  0.1) 

260. 1(  4.3) 

9.1(  1.6) 

244. 9(  5,3) 


1.5(  0.4) 
237.3(14.3) 

0.4(  0.1) 
248.1(12.8) 

0.3(  0.1) 
258. 8(  8.0) 

2.0(  0.5) 
230.2(13.4) 


0.0 
0.0 
0.0 
0.0 


AGE 

16  OR  YOUNGER 


17  YEARS  OLD 


18 OR OLDER


1992 


334011(  6X) 


14009         1699683(  OX) 


3079  530128C  3X) 


1,4(  0.3) 

303. 1(  7.7) 

0.0(  0.0) 

mmiiiil  0.0) 

0.0(  0.0) 

mumiiC  0.0) 


98. 6(  0.3) 
299.6^  1.4) 


0.0( 

iiiiiiiiii( 


0.0) 
0.0) 


0.0(  0.0) 
iiiiiiiiii(  0.0) 


0.0(  0.0) 

iiiiiiiiii(  0.0) 

100. 0(  0.0) 

295. 0(  0.7) 

O.OC  0.0) 

)l)ll»H(^  0.0) 


0.0(  0.0) 

)!««««(  0.0) 

0.0(  0.0) 

Mllllll(  0.0) 

86. 0(  1.0) 

266.61  1.4) 


0.0(  0.^) 

KKKKKl  0.0) 

0.0(  0.0) 

)!««««(  0.0) 

11. 6(  0.8) 

253. 1(  2.0) 


0.0(  0.0) 

«lffiiff«(  0.0) 

0.0(  0.0) 

HIIKKIIC  0.0) 

2.5(  0.3) 

244. 6(  6.2) 


0.0 


0.0 


0.0 




Table 15(42)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
SIZE/TYPE OF COMMUNITY


—  TOTAL 


SEX
MALE


FEMALE


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


PARENTAL EDUCATION
NOT GRADUATED H.S.


GRADUATED H.S.
POST H.S.
UNKNOWN


AGE

16 OR YOUNGER


17 YEARS OLD


18 OR OLDER


N 

19080 

9637 

13914 
2792 
1699 
675 

2300 
6600 
9378 
596 

1992 
14009 
3079 




WEIGHTED N
2563822( 0%)

1292364( 2%)
1271457( 2%)

1906547( 0%)
383493( 1%)
203453( 2%)
70329( 3%)

293458( 5%)
865215( 3%)
1301603( 3%)
78893( 5%)

334011( 6%)
1699683( 0%)
530128( 3%)


RURAL


DIS URB


ADV URB


BIG CITY


FRINGE


MEDIUM


SMALL


5.3C  1.2)  10. 6C  2.3)  16. 4C  2.7)  8.8C  2.1)  10  2C  2  7)  16  Af  i  qi  Ti  Qi  i  ^% 
2e..6.  3.2)  267.8.  2.5)  300.2.  3.0)  290.8.  Z.ll  zllll  l.l]  z\l.Vi  [.J! 


5.4C  1.2)    10. 3C  2.2)    17. 6(  2.6)      7.5!  2.1)    10.21  2  7)  16  9f  9  ni  i9  m  i 

279.9.  3.2)  263.9.  2.1)  295...  3.2)  28..5.  IM  zl'.ll  III  zl'.H  HI  dlVi  IM 

5.3.  1.2)    10.9.  2.'>)    15.2.  3.0)    10.0.  2.5)    10.3.  2.7)  16  6.  1  2)  ti  7i  i  7i 

289.5.  3.5)  271.5.  3.0)  305.9.  2.9)  295.6.  1.7)  ?93.6.  5.71  2^7  3.  1.1)  295:5!  l.ll 


5.2.  1.1)      <^.o.  l.<i)    18.5.  3.1)      7.2.  2.0)    11. 0(  2  9)    17  41  1  11  ai  1  oi 

293.7.  1.2)  281.2.  3.6)  302.9.  3.3)  299.0.  1.7)  zWZ  III  zVv  Vi  lizl  29":'!  ;:8'! 

7.3.  3.3)  32.2.  9.1)  7.9.  2.5)  11.6.  3.8)  8.2.  3.5)  12  6.  2  <>)  TO  ?l  t  qi 
259...  3.0)  262.1.  2.9)  283.0.  5..)  273...  2.6)  279.2.  I  M  zHXi  z  sl  dl  sl  z  zl 

..6.  3.2)    28.6.  9.3)    12.5.  6.9)    15.6.  6.5)      6.1.  2.9)    19  7.  8  2)    15  Of  &  Ql 

262.8.  ..1)  262.8.  2.1)  279.9.  2.8)  277.3.  ..9)  .^iis!  IM  ^V.H  HI  \\ll 

2.2.  0.9)    20.7.  5.2)    17.1.  ..3)    15.3.  5.1)    12.9.  4  4)    14  ti  9  11         r,  ,  -» 


III  .^:[\  tv,  ir,     is;     iv,  if, 

.15.1:  l-ll IM      m .j-;  in 

3.3C  0.7)  7.3C  1.7)  24. 4C  4.1)  9.2C  2.4)  10. 8C  2  9)  in  91  i  ai  9a  m  i  -fi 
301.0.  2.5)  280.8.  2.6)  306.1.  2.3)  298.9.  2.1)  2;!:;.  l.ll  zlt.H  \M  [[H 

5  ?!  '-^f  2.8»    10.5.  3.3)    15.6.  2  5)    20  91  ?  11 

261.2.  4.4)  248.1.  2.7)  264.2.  6.0)  263.9.  9.7)  275.5.  4.6)  265.3.  7  0)  zll  n  I  sl 


,oo  =!  2  21.1.  4.1)    10.3.  2.9)     11.4.  3.0)     17  2.  3  0)    24  6.  9  51 

298.5.  6.6)  279.0.  3.4)  307.0.  3.9)  302.4.  3.2)  294.6.  2.4!  3^2.3.  2  J)  35;:9.  2.1) 

I  ll  2.9)      8.4.  2.0)    10.9.  2.9)  17.5. 

290.2.  2.7)  276.5.  2.7)  303.5.  2.6)  296.1.  1.9)  295.2.  1.2)  296.6. 


1..)  32.8. 
1..)  2?4.9. 


1.6) 
0.9) 


6.0.  1.4)  18.5.  3.6)  11.7.  2.0)  9.0.  2.4)  7  3.  2  0)  14  01  9  9\  tt  &i  «  i> 
262.3.  4.3)  251.4.  2.1)  277.4.  266.7.  5.1)  263':4'!  2':5°!  2J6:;!  2M!  266:;.'  l.ll 


%-OMIT


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 




Table  15(43) 


NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

PARENTAL EDUCATION


N 

WEIGHTED N

NOT HS

GRAD HS

POST HS

UNKNOWN

%-OMIT

—  TOTAL  — 

16674 

25391691 

IX) 

11. 6( 
269. 5( 

0.6) 
1.2) 

34, 1( 
261. 6( 

1.0) 
0,7) 

51.  3f 
300. 6f 

1.3) 
0,9) 

3. If 
259. 2f 

0.2) 
2.1) 

1.0 

SEX 
MALE

9319 

1277103C 

2X) 

10. 5( 
264. 5( 

0.6) 
1.5) 

33. 9C 
275. 9C 

1.1) 
1.0) 

52. 2f 
295, 9f 

1.4) 
1,1) 

3.5f 
256. 4f 

0.2) 
2.6) 

1.2 

FEMALE

9555 

12620671 

2X) 

12. 6( 
273,71 

0.7) 
1.6) 

34. 3( 
267. 7( 

1.2) 
0.6) 

50. 4f 
305. 6f 

1.5? 
1.0) 

2.7f 
260. 3f 

0.3) 
3.3) 

0.7 

ETHNICITY/RACE 
WHITE 

13736 

1665096 C 

IX) 

e.K 

277. 6( 

0.6) 
1.6) 

34, 7( 
267, 1( 

1.3) 
0.7) 

55. 5f 
304. 6f 

1.6) 
1,0) 

1.6f 
272. 2f 

0.11^ 
3.1) 

1.1 

BLACK

2776 

361266 C 

IX) 

16  «:t 
259. 0( 

1.4) 
2,6) 

37, 7( 
262, 4( 

1.4) 
2.0) 

40. 6f 
279, 7f 

1.9) 
1,9) 

5.M 
249. 6f 

0.6) 
4.3) 

0.6 

HISPANIC 

1691 

2027231 

2X) 

34. 5t 
261. 6( 

4.4) 
1.6) 

25. 7( 
268, 7( 

2,5) 
2.5) 

30, If 
266. If 

3.6) 
2,0) 

9.6f 
247. 2f 

1.2) 
2.2) 

0.4 

OTHER 

671 

70065C 

3X) 

11. 3( 
265, 4( 

1.3) 
5.6) 

22, 9( 
279, 7t 

1.9) 
4.0) 

57. 5f 
299.  3f 

2.2> 
2.1) 

6.2f 
258. Of 

1.2) 
5.2) 

0.4 

PARENTAL EDUCATION
NOT  GRADUATED  H.S. 

2300 

293456C 

5X) 

100.01 
269. 5( 

0.0) 
1.2) 

0  .0( 

0.0) 
0.0) 

0.  Of 
mmiiiif 

0,0) 
0.0) 

O.Of 
Kimiiiif 

0.0) 
0.0) 

GRADUATED H.S.

6600 

6652151 

3X) 

0.0( 

0.0) 
0.0) 

100. Of 
261 .6f 

0.0) 
0.7) 

O.Of 
mmiiiif 

0.0) 
0.0) 

O.Of 
iHdmiif 

0.0) 
0.0) 

0.0 

POST  H.S. 

9376 

13016031 

3X) 

0.0( 

0.0) 
0.0) 

O.Of 
mmiiiif 

0.0) 
0.0) 

100. Of 
300, 6f 

0.0) 
0.9) 

O.Of 
mumiif 

0.0) 
0.0) 

0.0 

UNKNOWN

596 

786931 

5X) 

0.0( 

0.0) 
0.0) 

O.Of 

0.0) 
0.0) 

O.Of 

0.0) 
0.0) 

100. Of 
259. 2( 

0.0) 
2.1) 

o.o 

AGE 

16  OR  YOUNGER 

1974 

3316641 

6X) 

7.7C 
264. 6( 

0.9) 
3.1) 

29. If 
290, 9f 

1.7) 
1.6) 

60. 6f 
306, 7f 

2.2) 
1,7) 

2.3f 
269. 2f 

0.3) 
6.5) 

0.7 

17  YEARS  OLD 

13849 

1662103C 

IX) 

9.5( 
279. IC 

0.6) 
1.2) 

34. 4f 
267. 6f 

1.1) 
0.7) 

53. 7f 
303. 6f 

1.3) 
0.6) 

2.4f 
269. 2f 

0.1) 
2.9) 

1.0 

18 OR OLDER


.^051 


525383f  4X) 


20. 6f  1.3) 
251. 6f  1.7) 


16. If  1.2) 
259. 6f  1.5) 


37. 4f  1.5) 
260. Of  1.9) 


5.9f 
243. 7f 


0.4) 
2.6) 


0.9 


Table 15(44)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS

WEIGHTED RESPONSE PERCENTAGES AND GENERAL READING PROFICIENCY MEANS - REPORTING VARIABLES
(MEANS ARE BASED ON A SINGLE SET OF PLAUSIBLE VALUES)

PERCENT AT OR ABOVE ANCHOR POINTS

                           N     WEIGHTED N          150          200          250          300          350

— TOTAL —              19080   2563822( 0%)  100.0( 0.0)   98.7( 0.1)   84.8( 0.5)   40.2( 0.8)    5.0( 0.3)

SEX
  MALE                  9443   1292364( 2%)  100.0( 0.0)   98.3( 0.2)   81.1( 0.8)   35.6( 0.9)    3.7( 0.3)
  FEMALE                9637   1271457( 2%)  100.0( 0.0)   99.1( 0.1)   88.5( 0.5)   44.8( 1.0)    6.2( 0.4)

ETHNICITY/RACE
  WHITE                13914   1906547( 0%)  100.0( 0.0)   99.3( 0.1)   89.6( 0.5)   46.6( 0.9)    6.1( 0.3)
  BLACK                 2792    383493( 1%)  100.0( 0.0)   97.1( 0.5)   68.8( 1.5)   19.1( 1.5)    1.0( 0.2)
  HISPANIC              1699    203453( 2%)  100.0( 0.0)   96.2( 0.4)   73.0( 2.0)   19.5( 1.7)    1.4( 0.4)
  OTHER                  675     70329( 3%)  100.0( 0.0)   99.0( 0.4)   81.9( 2.3)   40.0( 1.9)    5.2( 0.9)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    2300    293458( 5%)  100.0( 0.0)   96.9( 0.5)   70.8( 1.3)   20.1( 1.1)    1.1( 0.3)
  GRADUATED H.S.        6600    865215( 3%)  100.0( 0.0)   98.4( 0.2)   81.7( 0.8)   31.3( 0.7)    2.6( 0.2)
  POST H.S.             9378   1301603( 3%)  100.0( 0.0)   99.5( 0.1)   91.5( 0.4)   52.2( 0.9)    7.6( 0.4)
  UNKNOWN                596     78893( 5%)   99.9( 0.1)   94.3( 1.0)   60.6( 3.0)   13.6( 2.2)    0.5( 0.4)

AGE
  16 OR YOUNGER         1992    334011( 6%)  100.0( 0.0)   99.4( 0.2)   92.2( 0.6)   50.5( 1.9)    7.8( 1.0)
  17 YEARS OLD         14009   1699683( 0%)  100.0( 0.0)   99.6( 0.1)   89.2( 0.5)   45.3( 0.7)    5.6( 0.3)
  18 OR OLDER           3079    530128( 3%)  100.0( 0.0)   95.4( 0.4)   66.1( 1.1)   17.0( 1.2)    1.1( 0.?)




Table  15(45) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES


IMPUTED STUDENT GRADE

                           N     WEIGHTED N      GRADE 4  %-OMIT

— TOTAL —               8807   1408047( 1%)  100.0( 0.0)     0.0
                                              1.58(0.01)

SEX
  MALE                  4410    694799( 2%)  100.0( 0.0)     0.0
                                              1.50(0.01)
  FEMALE                4397    713248( 1%)  100.0( 0.0)     0.0
                                              1.66(0.01)

ETHNICITY/RACE
  WHITE                 5931   1016633( 1%)  100.0( 0.0)     0.0
                                              1.63(0.01)
  BLACK                 1298    196446( 2%)  100.0( 0.0)     0.0
                                              1.38(0.02)
  HISPANIC              1169    152335( 5%)  100.0( 0.0)     0.0
                                              1.46(0.02)
  OTHER                  409     42633( 7%)  100.0( 0.0)     0.0
                                              1.60(0.03)

PARENTAL EDUCATION
  NOT GRADUATED H.S.     526     77745( 5%)  100.0( 0.0)     0.0
                                              1.43(0.03)
  GRADUATED H.S.        1793    275493( 4%)  100.0( 0.0)     0.0
                                              1.54(0.01)
  POST H.S.             3364    549929( 3%)  100.0( 0.0)     0.0
                                              1.66(0.01)
  UNKNOWN               3067    495435( 2%)  100.0( 0.0)     0.0
                                              1.53(0.01)

AGE
  9 YEARS OLD           5795   1026813( 1%)  100.0( 0.0)     0.0
                                              1.60(0.01)
  10 OR OLDER           2937    371667( 2%)  100.0( 0.0)     0.0
                                              1.52(0.01)




Table  15(46) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES


STUDENT  SEX 


-  TOTAL  — 


N 
6007 


WEIGHTED N
14060<»7(  IX) 


MALE 

0.6) 
1.50(0.01 ) 


FEMALE 

50. 7(  0.6) 
1.66(0.01) 


X-OMIT 
0.0 


SEX 
MALE 


FEMALE


<K»10  694799(  2X) 


^397  713C^e(  IX) 


100. 0(  0.0) 
1.50(0.01) 

0.0(  0.0) 
««^^^(0.0  ) 


0.0(  0.0) 

100. 0(  0.0) 
1.66(0.01) 


0.0 


0.0 


ETHNICITY/RACE
WHITE


BLACK 

HISPANIC 

OTHER 


5931         1016633(  1'/,) 


1298  196^^6(  2/C) 


1169  152335(  5X) 


^09  ^2633(  7X) 


^9.2(  0.6) 
1.55(0.01) 

^5.2(  1.6) 
1.30(0.02) 

5'*.1(  1.7) 
l.^O(O.OC) 

5J.9(  2.6) 
1.53(0.05) 


50. 6(  0.6) 
1.71(0.01) 

5^.6(  1.6) 
1.^5(0.02) 

^5.9(  1.7) 
1.53(0.03) 

^6.1(  2.6) 
1.6d(0.0^n 


0.0 


<3.0 


0.0 


0.0 


PARENTAL EDUCATION
NOT  GRADUATED  H.S. 


GRADUATED  H.S. 


POST  H.S. 


UNKNOWN


526 


777'^5(  57.) 


1793  275493(  ^V) 


336^  549929(  3X) 


3067  495^ 35(  ZA) 


^6.^(  3.^) 
1.35(0.0^) 

50. 2(  1.3) 
1.^5(0.01) 

51. 9(  0.9) 
1.50(0.01) 

^6.5(  1.2) 
1.^5(0.02) 


53. 6(  3.^) 
1.51(0.03) 

<»9.6(  1.3) 
1.6^(0.02) 

<»6.1(  0.9) 
1.75(0.02) 

53, 5(  1.2) 
1.59(0.01) 


0.0 


0.0 


0.0 


0.0 


AGE 

9 YEARS OLD


10 OR OLDER


5795         1026813(  V/.) 


2937  371667(  2X) 


^6.7(  0.7) 
1.52(0.01) 

56. 9(  1.0) 
l.<»5(0.02) 


53. 3(  0.7) 
1.67(0.01) 

<^3.1'  1.0) 
1.60(0.02) 


0.0 


0.0 




Table  15(47) 

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 4TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES


ETHNICITY/RACE


N

WEIGHTED N

WHITE

BLACK

HISPANIC

AMER IND

ASIAN

UNCLASS

%-OMIT

— TOTAL —

6607 

1406047( 

VA) 

72. 2(  0.5) 
1.63(0.01) 

14. 0(  0.3) 
1.38(0.02) 

1C.8(  0.5) 
1.46(0.02) 

1.2(  0.1) 
1.56( 0.04) 

1.6(  0.2) 
1.63( 0.04) 

0.0(  0.0) 

1 .001 0 . 23 1 

0.0 

SEX 
MALE 

4410 

694799( 

zy.) 

72. 0(  0.6) 
1.55(0.01) 

12. 8(  0.6) 
1.30(0.02) 

11. 9(  0.5) 
1.40(0.02) 

1.6(  0.2) 
1.50(0.03) 

1.7(  0.3) 
1.53(0.07) 

0.0(  0.0) 
1.95(1.63) 

0.0 

FEMALE

439/ 

713246( 

ly.) 

72. 4(  0.6) 
1.71(0.01) 

15. 1(  0.5) 
1.45(0.02) 

9.8(  0.6) 
1.53(0.03) 

0.9(  0.1) 
1.66(0.05) 

1.6(  0.2) 
1.70(0.06) 

0.0(  0.0) 
1.57(0  33) 

0.0 

ETHNICITY/RACE 
WHITE

5931 

1016633( 

y/.) 

100. 0(  0.0) 
1.63(0.01) 

0.0(  0.0) 
«K1(«1((0.0  ) 

0.0(  0.0) 
«##^-i((0.0  ) 

0.0(  0.0) 

)(KKK#(0.0  ) 

0.0(  0.0) 
tf)«)^x«(0.0  ) 

0.0(  0.0) 

#KM)(M(0.0  ) 

0.0 

BLACK 

1296 

1 96446 ( 

zy) 

0.0(  0.0) 
«y4ntit(0.0  ) 

100. 0(  O.U) 
1.38(0.02) 

0.0(  0.0) 

«(#K¥«(0.0  ) 

0.0(  0.0) 
0.0  ) 

0.0(  0.0) 
y#»#K(0.0  ) 

0.0(  0.0) 
«#^#K(0.0  ) 

0.0 

HISPANIC 

1169 

152335( 

sy) 

0.0(  0.0) 
«»«««( 0.0  ) 

0.0(  0.0) 

«»#M)l#(0.0  ) 

100. 0(  0.0) 
1.46(0.02) 

0.0(  0.0) 

«^(^)(K(0.0  ) 

0.0(  0.0) 

^()l«#K(0.0  ) 

0.0(  0.0) 

#KK«M(0.0  ) 

0.0 

OTHER 

409 

42633( 

7y) 

0.0(  0.0) 

tfM##1t(0.0  ) 

0.0(  0.0) 
)(K«)()((C.O  ) 

0.0(  0.0) 

K#K##(0.0  ) 

40. 7(  4.4) 
1.56(0,04) 

58. 1(  4.4) 
1.63(0.04) 

l.K  0.7) 
1.60(0.25) 

0.0 

PARENTAL EDUCATION
NOT  GRADUATED  H.S. 

526 

77745( 

sy.) 

64. 5(  3.1) 

X  .  H Tl  U  .  Uj  i 

14. 2(  2.1) 
1  T9f  n  n^A  1 

X  .  jCx  U  .  II  f  1 

19. 1(  2.1) 
1  32(0  07) 

1.5(  0.5) 
1  30(0  2L ) 

C.7(  0.3) 
1.40( 0.21 ) 

0.0(  0.0) 
««y¥¥(0.0  ) 

0.0 

GRADUATED H.S.

1793 

275493( 

4X) 

72. 7(  1.3) 
1.60(0.02) 

15. 1(  0.9) 
1.36(0.02) 

10. 4(  1.1) 
1.42(0.03) 

l.K  0.2) 
1.54(0.06) 

0.7(  0.2* 
1.41(0.09) 

0.0(  0.0) 
##«(#^(0.0  ) 

0.0 

POST  H.S. 

3364 

549929( 

73. 7(  0.6) 
1.72(0.01) 

13. 5(  0.9) 
1.44(0.02) 

9.4(  0.6) 
1.56(0.03) 

1.3(  0.2) 
1.63(0.06) 

2.1(  0.3) 
1.73(0.05) 

0.0(  0.0) 
1.64(1,69) 

0.0 

UNKNOWN

3067 

495435( 

zy) 

72. 0(  0.9) 
1.56(0.02) 

13. 6{  0.7) 
1.34(0.03) 

U.K  0.7) 
l.'»2(0.02) 

l.K  0.2) 
1.55(0.06) 

2.1(  0.3) 
1.57(0.07) 

O.K  0.1) 
1.64(0.47) 

0.0 

AGE 

9  YEARS  OLD 

5795 

1026813( 

VA) 

73. 9(  0.6) 
1.65(0.01) 

12. 6(  0.3) 

1.40(0. on) 

10. 5(  0.5) 
1.48(0.02) 

l.K  0.1) 
l.B6(0.05) 

1.8(  0.2) 
1.63(0.05) 

0.0(  0.0) 
1.67(0.19) 

0.0 

10  OR  OLDER 

2937 

371667( 

zy) 

67. 7(  1.0) 
1.56(0.01) 

17. 7(  0.9) 
1.35(0.03) 

11. 6(  0.9) 
1.41(0.03) 

1.5(  0.2) 
1.5&(0.08) 

1.4(  0.3) 
1.56(0.07) 

0.0(  0.0)
1.50(1.31) 

0.0 



Table  15(48)

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  4TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES


REGION


N  WEIGHTED  N  NE  SE  CENTRAL  WEST  %-OMIT

—  TOTAL  —  8807         1408047(  1%)  22.5(  0.5)       23.6(  1.4)       27.3(  1.3)       26.6(  0.7)  0.0

1.61(0.02)        1.54(0.02)        1.60(0.02)  1.57(0.01)


SEX

MALE


FEMALE


4410  694799(  2%)


4397  713248(  1%)


22. 6(  0.7) 
1.53(0.02) 

22. 2(  0.6) 
1.66(0.02) 


22. 6(  1.4) 
1.47(0.02) 

24. 4(  1.5) 
1.61(0.02) 


26. 7(  1.5) 
1.51(0.02) 

27.9(  1.5)
1.66(0.02) 


27. 7(  0.6) 
1.49(0.02) 

25. 6(  0.6) 
1.65(0.01) 


ETHNICITY/RACE

WHITE


BLACK

HISPANIC

OTHER


5931         1016633(  1%)


1298  196446(  2%)


1169


409


152335(  5%)


42633(  7%)


23. 2(  0.4) 
1.66(0.02) 

21. 1(  1.2) 
1.39(0.04) 

20. 6(  4.0) 
1.46(0.04) 

17. 4(  4.1) 
1.66(0.05) 


21. 1(  1.6) 
1.61(0.00 

45. 6(  1.2) 
1.36(0.03) 

14. 3(  4.5) 
1.47(0.05) 

14. 1(  3.0) 
1.61(0.06) 


31. 4(  1.6) 
1.63(0.02) 

19. 7(  3,1) 
1.40(0.04) 

12. 0(  2.6) 
1.47(0.03) 

19. 6(  3.4) 
1.61(0.06) 


24. 3(  0.5) 
1.63(0.01) 

13. 6(  3.1) 
1.37(0.03) 

52. 9(  2.3) 
1.45(0.03) 

46. 9(  6.1) 
1.56(0.05) 


PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 


GRADUATED  H.S. 
POST  H.S. 
UNKNOWN


526 


77745(  5X) 


1793  275493(  4X) 


3364  549929(  3X) 


3067  495435(  2X) 


16. 3(  1.5) 
1.49(0.06) 

22. 4(  2.1) 
1.57(0.03) 

21. 9(  1.3) 
1.70(0.02) 

23. 9(  1.2) 
1.55(0.02) 


32. 0(  2.6) 
1.43(0.04) 

26. 0(  2.3) 
1.51(0.03) 

22. 5(  1.6) 
1.63(0.02) 

22. 1(  2.0) 
1.49(0.02) 


23.5(  2.9) 
1.45(0.06) 

30. 2(  2.5) 
1.56(0.02) 

27. 1(  1.6) 
1.67(0.02)

26.6(  1.9)
1.56(0.02)


26.3(  2.3)
1.39(0.06)

21.4(  1.5)
1.53(0.03)

26.5(  1.4)
1.66(0.02)

27. 2(  1.2) 
1.51(0.01) 


AGE 

9  YEARS  OLD                           5795         1026813(  1%)           22.9(  0.6)  23.4(  1.5)  27.3(  1.5)  26.4(  0.6)

1.62(0.02)  1.57(0.02)  1.61(0.02)  1.59(0.01) 

10  OR  OLDER                           2937          371667(  2X)           20.91  l.X)  24. 2(  1.4)  27. ^(  1.7)  27. 4(  1.0) 

1.56(0.04)  1.46(0.52)  1.55(0.03)  1.51(0.02) 




Table  15(49) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  4TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES

IMPUTED  STUDENT  AGE


—  TOTAL  —


N
8807


WEIGHTED  N
1408047(  1%)


7-LESS

0.0(  0.0)
1.72(1.68)


8

0.7(  0.1)
1.55(0.07)


9

72.9(  0.4)
1.60(0.01)


10

23.4(  0.4)
1.53(0.01)


11

2.7(  0.2)
1.40(0.04)


12-MORE

0.3(  0.1)
1.35(0.07)


%-OMIT
0.0


SEX
MALE


FEMALE


4410


4397


694799(  2%)
713248(  1%)


0.0(  0.0)
*****(0.0  )

0.0(  0.0)
1.72(1.68)


0.5(  0.1) 
1.48(0.12) 

0.8(  0.2) 
1.59(0.09) 


69.0(  0.8) 
1.52(0.01) 

76. 7(  0.5) 
1.67(0.01) 


26. 6(  0.8) 
1.47(0.02) 

20. 3(  0.5) 
1.62(0.02) 


3.5(  0.3) 
1.35(0.06) 

2.0(  0.2) 
1.49(0.04) 


0.4(  0.1) 
1.29(0.07) 

0.1(  0.1)
1.54(0.19) 


0.0 


0.0 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


5931
1298
1169
409


1016633(  1%)
196446(  2%)
152335(  5%)
42633(  7%)


0.0(  0.0)

*****(0.0  )

0.0(  0.0)

*****(0.0  )

0.0(  0.0)

*****(0.0  )

0.2(  0.2) 
1.72(1.68) 


0.6(  0.1) 
1.61(0.08) 

0.9(  : 
1.25(0.X6) 

0.6(  0.2) 
1^4(0.14) 

2.2(  0.6) 
1.81(0.14) 


74. 7(  0.5) 
1.65(0.01) 

65. 7(  1.3) 
1.40(0.02) 

70. 7(  1.5) 
1.48(0.02) 

72. 1(  1.6) 
1.61(0.03) 


22. 7(  0.5) 
1.59(0.01) 

26. 6(  1.4) 
1.37(0.03) 

24. 2(  1.6) 
1.42(0.02) 

21. 4(  1.5) 
1.56(0.06) 


1.9(  0.2) 
1.49(0.04) 

6.1(  0.7) 
1.27(0.06) 

3.9(  0.6) 
1.34(0.08) 

3.9(  0.8) 
1.52(0.12) 


0.1(  0.0)
1.44(0.12) 

0.7(  0.2) 
1.27(0.13  ) 

0.5(  0.2) 
1.32(0.08) 

0.2(  0.2) 
1.48(0.47) 


0.0 
0.0 
0.0 
0.0 


PARENTAL  EDUCATION
NOT  GRADUATED  H.S.


GRADUATED  H.S.
POST  H.S.
UNKNOWN


526
1793
3364
3067


77745(  5%)
275493(  4%)
549929(  3%)
495435(  2%)


0.0(  0.0)

*****(0.0  )

0.0(  0.0)

*****(0.0  )

0.0(  0.0)

*****(0.0  )

0.0(  0.0) 
1.72(1.68) 


0.2(  0.2) 
0.97(0.24) 

0.4(  0.1) 
1.37(0.1:;: 

0.8(  0.1) 
1.66(0.12) 

0.7(  0.2) 
1.47(0.09)


61. 3(  2.9) 
1.45(0.03) 

69.1(  1.2) 
1.56(0.01) 

75. 6(  0.7) 
1.68(0.01) 

73. 9(  0.9) 
1.54(0.02) 


32. 4(  2.5) 
1.41(0.C^) 

26. 9(  1.1) 
1.51(0.02) 

21.6(  0.7) 
1.62(0.02) 

22. 1(  0.8) 
1.30(0.02) 


5.9(  1.1) 
1.40(0.09) 

3.4(  0.4) 
1.40(0.04) 

1.8(  0.2) 
1.42(0.09) 

2.9(  0.4) 
1.40(0.06) 


0.3(  0.2) 
1.33(0.12) 

0.2(  0.1) 
1.58(0.17) 

0.1(  0.0)
1.36(0.16) 

0.4(  0.1) 
1.35(0.13) 


0.0 
0.0 
0.0 
0.0 


AGE

9  YEARS  OLD


10  OR  OLDER


5795
2937


1026813(  1%)
371667(  2%)


0.0(  0.0)
*****(0.0  )

0.0(  0.0)

*****(0.0  )


0.0(  0.0)

*****(0.0  )

0.0(  0.0)

*****(0.0  )


100.0(  0.0)
1.60(0.01)

0.0(  0.0)

*****(0.0  )


0.0(  0.0)

*****(0.0  )

88.7(  0.7)
1.53(0.01)


0.0(  0.0)

*****(0.0  )

10.3(  0.7)
1.40(0.04)


0.0(  0.0)

*****(0.0  )

1.0(  0.2)
1.35(0.07)


0.0


0.0


Table  15(50) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  4TH  GRADERS

WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES


SIZE/TYPE  OF  COMMUNITY


—  TOTAL  —


N
8807


WEIGHTED  N
1408047(  1%)


RURAL

6.2(  0.9)
1.53(0.02)


DIS  URB

11.6(  2.0)
1.42(0.02)


ADV  URB

14.8(  2.2)
1.70(0.02)


BIG  CITY

7.4(  1.6)
1.55(0.03)


FRINGE

11.1(  1.6)
1.60(0.02)


MEDIUM

16.6(  1.3)
1.59(0.02)


SMALL

32.5(  1.6)
1.56(0.01)


%-OMIT


0.0


SEX
MALE


FEMALE


4410  694799(  2%)


4397  713248(  1%)


6.1(  1.0) 
1.42(0.03) 


10.8(  1.9) 
1.34(0.02) 


14. 5(  2.1) 
1.61(0.03) 


7.9(  1.7) 
1.46(0.04) 


11. 3(  1.7) 
1.52(0.02) 


16. 8(  l.O 
1.52(0.02) 


32. 6(  1.8) 
1.50(0.01) 


6.2(  1.0)  12. 2(  2.2)  15. 1(  2.4)  6.8(  1.6)  10. 8(  1.4)  16. 3(  1.4)  32. 5(  1.7) 
1.63(0.03)    1.50(0.03)    1.79(0.03)    1.63(0.04)    1.68(0.04)    1.65(0.03)  1.66(0.02) 


0.0 


0.0 


ETHNICITY/RACE

WHITE


BLACK

HISPANIC

OTHER


5931         1016633(  1%)


1298  196446(  2%)


1169  152335(  5%)


409


42633(  7%)


6.6(  1.1)
1.56(0.03)

4.1(  1.3)
1.34(0.03)

6.9(  2.5) 
1.36(0.06) 

3.9(  2.2) 
1.35(0.12) 


4.0(  X.6) 
1.54(0.03) 

34. 9(  6.5) 
1.33(0.03) 

19. 5(  5.2) 
1.39(0.03) 

7.8(  1.7) 
1.49(0.08) 


16. 3(  2.5) 
1.73(0.02) 

7.5(  2.2)

1.55(0.09)

11.7(  2.4)
1.62(0.05)

23.5(  4.0)
1.68(0.07)


5.9(  1.6)

1.63(0.04)

11.0(  3.3)

1.37(0.03)

11.6(  3.2)
1.46(0.05)

10.5(  2.6)
1.65(0.06)


12.1(  1.6)
1.63(0.02)

4.1(  1.2)

1.44(0.09)

11.0(  3.3)
1.46(0.06) 

17. 9(  5.4) 
1.62(0.05) 


16.8(  1.3)
1.63(0.02) 

14. 0(  2.2) 
1.43(0.07) 

18. 4(  4.2) 
1.44(0.02) 

16. 4(  2.8) 
1.59(0.07) 


36. 4(  1.9) 
1.61(0.01) 

24. 5(  3.9) 
1.39(0.02) 

20. 8(  3.5) 
1.48(0.04) 

19. 9(  3.3) 
1.57(0.05) 


0.0 
0.0 
0.0 
0.0 


PARENTAL  EDUCATION
NOT  GRADUATED  H.S.


GRADUATED  H.S.
POST  H.S.
UNKNOWN


526


77745(  5%)


1793  275493(  4%)


3364  549929(  3%)


3067  495435(  2%)


10. 0(  1.8) 
1.35(0.10) 

9.2(  1.4) 
1.53(0.04) 

4.8(  0.6)
1.61(0.04) 

5.3(  1.2) 
1.50(0.05) 


13. 4(  2.5) 

1.29(0.05)

11.0(  2.3)
1.42(0.05) 

10. 1(  7.7) 
1.50(0.03) 

13.1(  2.5) 
1.38(0.03) 


?.5(  0.9) 
X.!^0(0.13) 

7.2(  1.4) 
1.60(0.06)

22.1(  3.4)
1.76(0.02) 

12. 7(  2.1) 
1.63(0.03) 


5.7(  1.3) 
1.43(0.08) 

5.2(  1.3) 
1.46(0.05) 

7.3(  1.6) 
1.63(0.04) 

9.0(  2.1) 
1.52(0.04) 


10. 0(  1.5) 
1.50(0.10)

9.8(  1.4) 
1.56(0.04) 

10. 5(  2.0) 
1.66(0.03) 

12. 5(  1.6) 
1.56(0.03) 


20. 3(  2.7) 
1.46(1.04) 

14. 8(  1.6) 
1.56(0.04) 

16. 3(  1.4) 
1.65(0.03) 

17.4(  1.6)
1.55(0.03) 


37. 0(  3.8) 
1.46(0.04) 

42.8(  2.7) 
1.56(0.02) 

28. 9(  1.5) 
1.67(0.02) 

30.0(  1.7)
1.52(0.01) 


0.0 
0.0 
0.0 
0.0 


AGE

9  YEARS  OLD

10  OR  OLDER


5795         1026813(  1%)


2937  371667(  2%)


5.9(  0.9)  11. 4(  1.0)  15. 7(  2.4)  7.8(  1.8) 
1.54(0.04)    1.44(0.02)    1.72(0.02)  1.57(0.03) 


7.1(  1.1) 
1.49(0.04) 


11.7(  2.3)
1.37(0.04)


11.9(  1.8)
1.63(0.05)


5.8(  1.2)
1.50(0.04)


10.9(  1.7)

1.62(0.02)

11.5(  1.6)
1.54(0.03) 


16. 4(  1.2) 
1.61(0.03) 

17. 0(  1.9) 
1.53(0.03) 


31. 9(  1.5) 
1.60(0.01) 

34. 9(  2.2) 
1.53(0.01) 


0.0 


0.0 




Table  15(51) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  4TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES

PARENTAL  EDUCATION


—  TOTAL  —


N

8750


WEIGHTED  N
1398603(  1%)


NOT  HS

5.6(  0.3)
1.43(0.03)


GRAD  HS

19.7(  0.8)
1.54(0.01)


POST  HS

39.3(  1.0)
1.66(0.01)


UNKNOWN

35.4(  0.8)
1.53(0.01)


%-OMIT

0.7


SEX 
MALE


FEMALE 


4381  690366  (  2X) 


4369  708237(  2X) 


5.2(  0.4) 
1.35(0.04) 

5.9(  0.5) 
1.51(0.03) 


20. 0(  1.0) 
1.45(0.01) 

19.4(  0.9)
1.64(0.02) 


41. 3(  1.1) 
1.58(0.01) 

37. 4(  1.1) 
1.75(0.02) 


33. 4(  1.2) 
1.45(0.02) 

37. 4(  1.0) 
1.59(0.01) 


0.6 


0.7 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


5910         1012419(  IX) 


1282  194141 (  2X) 


1155  150310 (  5X) 


403 


41733(  6X) 


5.0(  0.4)
1.49(0.03) 

5.7(  0.8) 
1.32(0.04) 

9.9(  1.3) 
1.32(0.07) 

4.0(  1.0) 
1.33(0.17) 


19.8(  0.9)
1.60(0.02)

21.4(  1.5)
1.38(0.02)

19.1(  1.8)

1.42(0.03)

11.8(  1.6)
1.49(0.06) 


40. 0(  1.2) 
1.72(0.01) 

38. 3(  2.0) 
1.44(0.02) 

34. 5(  1.7) 
1.56(0.03) 

44. 7(  3.0) 
1.69(0.04) 


35. 2(  1.0) 
1.58(0.02) 

34.6(  1.8) 
1.34(0.03) 

36. 6(  2.0) 
1.42(0.02) 

39. 5(  3.2) 
1.57(0.05) 


0.4 

.  Z 
1.3 
2.1 


PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 


GRADUATED  H.S. 
POST  H.S. 
UNKNOWN 


526 


77745 (  5X) 


1793  275493 (  4X) 


3364  549929(  3X) 


3067  495435(  2X) 


100.0(  0.0)
1.43(0.03)

0.0(  0.0)

*****(0.0  )

0.0(  0.0)
*****(0.0  )

0.0(  0.0)
*****(0.0  )


0.0(  0.0)
*****(0.0  )

100.0(  0.0)
1.54(0.01)

0.0(  0.0)
*****(0.0  )

0.0(  0.0)

*****(0.0  )


0.0(  0.0)
*****(0.0  )

0.0(  0.0)

*****(0.0  )

100.0(  0.0)
1.66(0.01)

0.0(  0.0)
*****(0.0  )


0.0(  0.0)
*****(0.0  )

0.0(  0.0)
*****(0.0  )

0.0(  0.0)
*****(0.0  )

100.0(  0.0)
1.53(0.01)


0.0 
0.0 
0.0 
0.0 


AGE

9  YEARS  OLD


10  OR  OLDER


5758         1020049(  1%)


2917  368987(  2%)


4.7(  0.3)
1.45(0.03)

8.1(  0.8)
1.41(0.04)


18.7(  0.9)
1.56(0.01)

22.7(  1.0)
1.50(0.02)


40.8(  1.0)
1.68(0.01)

35.1(  1.2)

1.60(0.02)


35.9(  0.9)
1.54(0.02)

34.0(  1.1)
1.48(0.02)


0.7


o.y 


Table  15(52) 

WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING  A.R.M.  MEANS  -  REPORTING  VARIABLES
PERCENT  AT  OR  ABOVE  ANCHOR  POINTS

                            N    WEIGHTED  N           1.0           2.0           3.0           4.0

TOTAL                    8807  1408047(  1%)   92.4(  0.4)   15.9(  0.6)    0.0(  0.0)    0.0(  0.0)

SEX
  MALE                   4410   694799(  2%)   89.7(  0.7)   10.6(  0.6)    0.0(  0.0)    0.0(  0.0)
  FEMALE                 4397   713248(  1%)   95.0(  0.4)   21.2(  0.9)    0.0(  0.0)    0.0(  0.0)

ETHNICITY/RACE
  WHITE                  5931  1016633(  1%)   94.6(  0.5)   18.8(  0.7)    0.0(  0.0)    0.0(  0.0)
  BLACK                  1298   196446(  2%)   83.7(  1.4)    6.1(  0.9)    0.0(  0.0)    0.0(  0.0)
  HISPANIC               1169   152335(  5%)   88.7(  0.9)   10.3(  1.3)    0.0(  0.0)    0.0(  0.0)
  OTHER                   409    42633(  7%)   92.7(  1.8)   13.4(  2.1)    0.0(  0.0)    0.0(  0.0)

PARENTAL  EDUCATION
  NOT  GRADUATED  H.S.    526    77745(  5%)   86.8(  2.0)    8.4(  0.9)    0.0(  0.0)    0.0(  0.0)
  GRADUATED  H.S.        1793   275493(  4%)   91.7(  0.6)   14.2(  0.9)    0.0(  0.0)    0.0(  0.0)
  POST  H.S.             3364   549929(  3%)   94.7(  0.5)   21.2(  0.8)    0.0(  0.0)    0.0(  0.0)
  UNKNOWN                3067   495435(  2%)   91.2(  0.6)   12.5(  0.8)    0.0(  0.0)    0.0(  0.0)

AGE
  9  YEARS  OLD          5795  1026813(  1%)   93.2(  0.4)   17.4(  0.6)    0.0(  0.0)    0.0(  0.0)
  10  OR  OLDER          2937   371667(  2%)   90.1(  0.9)   12.1(  0.8)    0.0(  0.0)    0.0(  0.0)




Table  15(53) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  8TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES

IMPUTED  STUDENT  GRADE

                            N    WEIGHTED  N       GRADE  8    %-OMIT

TOTAL                   11092  1682192(  1%)   100.0(  0.0)       0.0
                                                 2.05(0.01)

SEX
  MALE                   5486   839776(  1%)   100.0(  0.0)       0.0
                                                 1.96(0.01)
  FEMALE                 5606   842416(  1%)   100.0(  0.0)       0.0
                                                 2.14(0.01)

ETHNICITY/RACE
  WHITE                  7916  1249837(  1%)   100.0(  0.0)       0.0
                                                 2.11(0.01)
  BLACK                  1500   232931(  2%)   100.0(  0.0)       0.0
                                                 1.86(0.01)
  HISPANIC               1271   150212(  4%)   100.0(  0.0)       0.0
                                                 1.87(0.02)
  OTHER                   405    49212(  4%)   100.0(  0.0)       0.0
                                                 2.09(0.03)

PARENTAL  EDUCATION
  NOT  GRADUATED  H.S.   1072   152336(  5%)   100.0(  0.0)       0.0
                                                 1.89(0.02)
  GRADUATED  H.S.        3887   584033(  3%)   100.0(  0.0)       0.0
                                                 2.02(0.01)
  POST  H.S.             5038   785037(  3%)   100.0(  0.0)       0.0
                                                 2.13(0.01)
  UNKNOWN                1000   144685(  5%)   100.0(  0.0)       0.0
                                                 1.90(0.02)

AGE
  13  YEARS  OLD         7420  1155803(  1%)   100.0(  0.0)       0.0
                                                 2.06(0.01)
  14  OR  OLDER          3583   513626(  2%)   100.0(  0.0)       0.0
                                                 1.98(0.01)


Table  15(54) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  8TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES

STUDENT  SEX


—  TOTAL  —


N

11092


WEIGHTED  N
1682192(  1%)


MALE

49.9(  0.6)
1.96(0.01)


FEMALE
50.1(  0.6)

2.14(0.01)


%-OMIT
0.0


SEX
MALE


FEMALE


5486           839776(  1%)         100.0(  0.0)  0.0(  0.0)  0.0

1.96(0.01)  *****(0.0  )

5606           842416(  1%)            0.0(  0.0)  100.0(  0.0)  0.0

*****(0.0  )  2.14(0.01)


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


7916         1249837(  1%)


1500


405


232931(  2%)


1271  150212(  4%)


49212(  4%)


50. 3C  0.7) 
2.02(0.01) 

48. 3(  1.3) 
1.73(0.02) 

48. 8(  :..5) 
1.77(0.02) 

49. 9(  2.6) 
2.00(0.05) 


49.7(  0.7) 
2.20(0.01) 

51. 7(  1.3) 
1.94(0. OS) 

51. 2(  7  i) 
1.97(0.J2) 

50. 1(  2.6) 
2.19(0.03) 


0.0 
0.0 
0.0 
0.0 


PARENTAL  EDUCATION
NOT  GRADUATED  H.S.


GRADUATED  H.S.
POST  H.S.
UNKNOWN


1072  152336(  5X) 


3887  584033(  3X) 


5038  785037(  3X) 


1000  144685(  5X) 


43. 6(  1.6) 
1.79(0.03) 

49. 5(  0.8) 
1.93(0.01) 

50.7(  0.9)
2.04(0.01) 

52. 9(  1.6) 
1.84(0.03) 


56.4(  1.6) 
1.97(0.02) 

50. 5(  0.8) 
2.11(0.01) 

49. 3(  0.9) 
2.23(0.01) 

47. 1(  1.6) 
1.93(0.03) 


0.0 


0.0 


0.0 


0.0 


AGE 

13  YEARS  OLD 


14  OR  OLDER


7420         1155803(  1%)


3583


513626(  2%)


47. 0(  0.6) 
1.9<>(0.01) 

56. 5(  1.2) 
1.91(0.01) 


53. 0(  0.6) 
2.16(0.01) 

43. 5(  1.2) 
2.07(0.02) 


0.0 


0.0




Table  15(55) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  8TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES

ETHNICITY/RACE


—  TOTAL  —


N

11092


WEIGHTED  N
1682192(  1%)


WHITE

74.3(  0.4)
2.11(0.01)


BLACK

13.8(  0.2)
1.86(0.01)


HISPANIC

8.9(  0.3)
1.87(0.02)


AMER  IND

1.1(  0.1)
2.08(0.05)


ASIAN

1.8(  0.2)
2.11(0.04)


UNCLASS

0.0(  0.0)
1.88(0.28)


%-OMIT
0.0


SEX
MALE


FEMALE


5486
5606


839776(  1%)

842416(  1%)


74. 9C  0.6) 
2.02C0.01) 

73.7C  0.5) 
2.20(0.01) 


13. 4C  0.4) 
1.78C0.02) 

14. 3C  0.3) 
1.94(0.02) 


e.7C  0.4) 
1.77C0.02) 

9.1C  0.5) 
1.97(0.r2) 


1.2C  0.2) 
1.97C0.08) 

l.OC  0.2) 
2.20(0.05) 


1.7C  0.2) 
2.02C0.n6) 

1.9C  0.3) 
2.18(0.05) 


O.OC  0.0) 

KKKKIfCO.O  ) 

O.OC  0.0) 
1.88C0.28) 


0.0 


0.0 


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


7916
1500
1271
405


1249837(  1%)
232931(  2%)
150212(  4%)
49212(  4%)


100. OC  0.0) 
2.11C0.01) 

O.OC  0.0) 
)f)i)i)M(C0.0  ) 

O.OC  0.0) 
iKiffffffCO.O  ) 

O.OC  0.0) 
»if)iffffC0.0  ) 


O.OC  0.0) 
^HfffffffCO.O  ) 

100. OC  0.0) 
1.&6C0.01) 

O.OC  0.0) 

KIIKKKtO.O  ) 

O.OC  0.0) 

IHHH(«C0.0  ) 


O.OC  0.0) 
»«KiiiiC0.O  ) 

O.OC  0.0) 

IHHIffllCO.O  ) 

100. OC  0.0) 
1.87C0.02) 

O.OC  0.0) 

MfKKKCO.O  ) 


O.OC  0.0) 

4H(lflf«C0.0  ) 

O.OC  0.0) 
kkkkkCO.O  ) 

O.OC  0.0) 

KKKKKCO.O  ) 

36. 4C  4.9) 
2.08C0.05) 


O.OC  0.0) 

4H(«»«CO.O  ) 

O.OC  0.0) 

4H(II»«C0.0  ) 

O.OC  0.3) 

4H(llllllC0.0  ) 

61. OC  5.2) 
2.11C0.04) 


O.OC  CO) 
ififiiiiiiC  0.0  ) 

O.OC  0.0) 
iHHdmCO.O  ) 

O.OC  0.0) 
if»«ififCO.O  ) 

0.6C  0.5) 
1.88C0.28) 


0.0 


0.0 


0.0 


0.0 


PARENTAL  EDUCATION
NOT  GRADUATED  H.S.


GRADUATED  H.S.


POST  H.S.
UNKNOWN


1072
3887
5038
1000


152336(  5%)
584033(  3%)
785037(  3%)
144685(  5%)


57. 6C  2.7) 
1.95C0.02) 

77.2C  0.9) 
2.06C0.01) 

79. 2C  0.6) 
2.18C0.01) 

52. 5C  2.6) 
1.98C0.02) 


18. IC  1.4) 
1.76C0.05) 

13. 9C  0.7) 
l.e7C0.02) 

12. IC  0.5) 
1.92C0.02) 

19. 2C  1.9) 
1.78C0.05) 


21. 2C  3.0) 
1.84C0.04) 

7.0C  0.5) 
1.89C0.03) 

5.5C  0.3) 
1.96C0.02) 

23. IC  2.6) 
1.79C0.03) 


2.2C  0.5) 
1.92C0.11) 

I.IC  0.2) 
2.05C0.07) 

0.9C  0.2) 
2.19C0.09) 

I.IC  0.2) 
2.06C0.19) 


0.9C  0.3) 
1.95C0.07) 

0.8;  0.2) 
2.00C0.10) 

2.2C  0.3) 
2.17C0.06) 

4.2.  0.8) 
2.01C0.09) 


i^.OC  0.0) 
M»««iiC0.0  ) 

O.OC  0.0) 
0.0  ) 

O.OC  0.0) 
1.86(0.28) 

O.OC  0.0) 
ifif»«ifC0.0  ) 


0.0 


0.0 


0.0 


0.0 


AGE 

13  YEARS  OLD


14  OR  OLDER


7420
3583


1155803(  1%)
513626(  2%)


77. 7C  0.3) 
2.13C0.01) 

66. 9C  0.9) 
2.05C0.01) 


12. IC  0.3) 
1.90C0.02) 

17.6'  0.6) 
1.81C0.02) 



7.2(  0.2) 
1.91C0.02) 

13. OC  0.9) 
1.83C0.02) 


I.IC  0.2) 
2.09C0.07) 

1.3C  0.2) 
2.04C0.05) 


2.0C  0.2) 
2.15C0.05) 

1.3C  0.2) 
1.95C0.06) 


O.OC  0.0) 
1.88C0.28) 

O.OC  0.0) 
iHnmiiC  0.0  ) 


0.0 


0.0 



Table  15(56)

REGION


—  TOTAL  —


N  WEIGHTED  N

11092         1682192(  1%)


NE

22.7(  0.5)
2.09(0.01)


SE

23.1(  1.5)
2.03(0.02)


CENTRAL

26,3C  1,^^) 
2.06C0.01) 


WEST

27.6(  0.6)
2.03(0.02)


SEX
MALE


FEMALE


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


PARENTAL  EDUCATION
NOT  GRADUATED  H.S.


GRADUATED  H.S.


5486  839776(  1%)


5606  842416(  1%)


7916         1249837(  1%)


1500  232931(  2%)


1271  150212(  4%)


405


POST  H.S.


UNKNOWN


AGE

13  YEARS  OLD


14  OR  OLDER


49212(  4%)


1072  152336(  5%)


3887  584033(  3%)


5038  785037(  3%)


1000  144685(  5%)


7420         1155803(  1%)


3583  513626(  2%)


23.31  0.7) 
2.0010,02) 

22.11  0.6) 
2.1610.01) 


23,71  0,2) 
2.1310.01) 

22.61  0.8) 
1.9110.03) 

15.51  3.6) 
1.9110.03) 

20.01  5.6) 
2.1410.06) 


18.31  1.9) 
1.9410.04) 

24.61  l.A) 
2.0710.02) 

22.61  1.1) 
2.1610.01) 

21.51  2.7) 
1.9010.03) 


23.21!  0.4) 
2.1110.01) 

2CI.S1  1.3) 
2.0210.02) 


22.91  1.6) 
1.9510.02) 

23.31  1.6) 
2.1110.02) 


21.71  1.7) 
2.1110.02) 

41.21  0.8) 
1.8410.02) 

10.41  4.2) 

I.  9110.05) 

II.  41  2.8) 
2.0610.06) 


30.61  2.9) 
1.6610.03) 

21.81  1.7) 
1.9810.02) 

23.71  2.1) 
2.1310.03) 

19.21  2.6) 
1.9110.05) 


22.61  1.6) 
2.0610.02) 

24.11  1.6) 
1.9410.02) 


2t.31  1.5) 
1.9610.01) 

26.41  1.5) 
2.1610.02) 


29.91  1./: 
2.0910.01) 

20.81  3.5) 
1.6410.04) 

7.41  2.0) 
1.6110.06) 

19.51  2.9) 
2.1010.05) 


22*61  2.7) 
1.92(0.04) 

30.21  2.1) 
2.0210.01) 

25.31  2.0) 
2  1310.02) 

22.21  2.7) 
1.9510.03) 


26.91  1.5) 
2.0610.02) 

25.4(  1.6) 
2.0010.02) 


27.M  0.7) 
1.9510.01) 

26.21  0.9) 
2.1110.02) 


24.61  0.3) 
2.1010.02) 

15.41  3.4) 
1.88(0.03) 

66.71  1.4) 
1.8710.02) 

40.11  6.6) 
2.0610.06) 


26.21  3.6) 
1.8610.04) 

23.51  1.6) 
2.0110.02) 

28.21  1.4) 
2.1110.02) 

37.11  3.0) 
1.8710.02) 


27.11  0.5) 
2.0710.02) 

29.71  1.3) 
1.9610.02) 


0.0 
0.0 

0.0 
0.0 
0.0 
0.0 

0.0 
0  ) 
0.0 
0.0 

0.0 
0.0 




Table  15(57) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  8TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES


IMPUTED  STUDENT  AGE 


N 

WEIGHTED 

N 

11-LESS 

12 

13 

14 

15 

16-MORE 

X-OMIT 

—  TOTAL  —

11092

1682192(

1%)

0.0(  0.0)
2.19(0.32)

0.7(  0.1)

68.7(  0.5)
2.06(0.01)

26.0(  0.5)

3.9(  0.2)
1.86(0.02)

0.7(  0.1)
1.79(0.05)

0.0 

MALE 

5486 

839776(

IX) 

O.OC  0.0) 
«iiii#«C  0.0  ) 

0.7C  0.2) 
1.96C0.10) 

64. 7C  0.7) 
1.99C0.01) 

29. 2C  0.7) 
1.93C0.01) 

4.6C  0.3) 
1.61C0.03) 

0.8C  0.2) 
1.69C0.09) 

0.0 

FEMALE 

5606 

842416(

IX) 

O.OC  0.0) 
2.19(0.32) 

0.8C  0.2) 
2.15C0.06) 

72. 7C  0.7) 
2.16C0.01) 

22. 7C  0.6) 
2.09C0.02) 

3.2C  0.3) 
l.VgCO.Cj) 

0.6C  0.1) 
91C0.09) 

0.0 

ETHNICITY/RACE
WHITE 

7916 

1249837C 

IX) 

O.OC  0.0) 
Z.19I 0 • 3Z  1 

0.7C  0.2) 

9   1  ct  n   flA  \ 

71. 8C  0.4) 
9  1  Tf  n  nil 

24. 6C  0.5) 
?  .0710  oil 

2.6C  0.1) 
1.93C  0.03) 

0.3C  0.1) 
1.95C0. 08) 

0.0 

BLACK 

1500 

232931( 

2X) 

O.Ot  0.0) 
y«4f»«(0.0  ) 

1.3C  0.4) 
1.90C0.06) 

59. 8C  1.2) 
1.90C0.02) 

27. 6C  1.3) 
1.82C0.0;?) 

6.7C  0.9) 
1.79C0.06) 

2.5C  0.6) 
1.72C0.05) 

0.0 

HISPANIC 

1271 

150212( 

4X) 

O.OC  0.0) 

4H»tfl*»f0.0  ) 

0.5C  0.2) 
1.66C0.16) 

55. IC  1.9) 
1.91C0.02) 

35. 9C  1.2) 
1.65C0.03) 

7.3C  1.7) 
1.76C0.03) 

1.2C  0.7) 
1.64C0.30) 

0.0 

OTHER 

405 

49212( 

4X) 

O.Ot  0.0) 
iiii»#ii(0.0  ) 

0.6C  0.3) 
2.03C0.12) 

72. 9C  2.1) 
2.13C0.04) 

23. OC  1.9) 
2.00C0.04) 

2.9C  0.7) 
1.94C0.11) 

0.4C  0.3) 
1.95C0.46) 

0.0 

PARENTAL  EDUCATION 
NOT  GRADUATED  H.S.

1072 

152336( 

5X) 

O.K  0.1) 
2.10(2.24) 

0.7C  0.4) 
1.91C0.19) 

52. 7C  1.6) 
1.92C0.03) 

34. IC  1.2) 
1.87C0.03) 

10. 7C  1.1) 
1.63C0.05) 

1.7C  0.5) 
1.75C0.14) 

0.0 

GRADUATED  H.S.

3887

584033(

3%)

O.Ot  0.0) 
2.27(0.30) 

0.7C  0.2) 
2.13C0.07) 

67.4C  0.9) 
2.05C0.01) 

27. 7C  0.8) 
1.96C0.01) 

3.7C  0.3) 
1.86C0.05) 

0.5C  0.1) 
1.97C0.06) 

0.0 

POST  H.S. 

5038 

785037(

3%)

O.OC  0.0) 
tHi«««(0.0  ) 

0.8C  0.2) 
2.06C0.07) 

74. 9C  0.6) 
2.15C0.01) 

22. IC  0.7) 
2.09C0.02) 

1.6C  0.2) 
1.90C0.05) 

0.4C  0.1) 
1.79C0.07) 

0.0 

UNKNOWN 

1000 

144685(

5%)

O.OC  0.0) 
iiii«ii«CC.O  ) 

0.8C  0.3) 
1.99C0.09) 

57. 6C  2.2) 
1.93C0.02) 

30. 9C  1.6) 
1.67C0.03) 

8.6C  1.0) 
1.83C0.04) 

1.6C  0.5) 
1.70C0.09) 

0.0 

AGE 

13  YEARS  OLD

7420

1155803(

1%)

O.OC  0.0) 
IHI«««C0.0  ) 

O.OC  0.0) 
«H(ii««C0.0  ) 

100. OC  0.0) 
2.06C0.01) 

O.OC  0.0) 
IHi4i««C0.0  ) 

O.OC  0.0) 
uttimiiCO.O  ) 

O.OC  0.0) 

IM»IM»«C0.0  ) 

0.0 

14  OR  OLDER 

3583 

513626(

2%)

O.OC  0.0) 
C  0.0  ) 

O.OC  0.0) 
4*IH(««C0.0  ) 

O.OC  0.0) 
IH(«««C0.0  ) 

85. OC  0.9) 
2.0010.01) 

12. 7C  0.6) 
1.36C0.02) 

2.2C  0.4) 
1.79C0.05) 

0.0 



Table  15(58) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  8TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES

SIZE/TYPE  OF  COMMUNITY


—  TOTAL  —


N

11092


WEIGHTED  N
1682192(  1%)


RURAL

5.1(  1.1)
2.03(0.03)


DIS  URB

8.5(  1.5)
1.86(0.02)


ADV  URB

10.8(  2.4)
2.21(0.02)


BIG  CITY

10.6(  3.5)
2.01(0.02)


FRINGE

16.8(  3.1)
2.07(0.02)


MEDIUM

15.3(  2.7)
2.04(0.03)


SMALL

33.0(  2.3)
2.05(0.01)


%-OMIT


0.0


SEX
MALE


FEMALE


5486  839776(  1%)


5606  842416(  1%)


5.0(  1.0)  7.8(  1.5)  11.1(  2.5)  10.8(  3.6)  17.5(  3.2)  15.3(  2.9)  32.4(  2.3)  0.0

1.94(0.03)  1.76(0.03)  2.14(0.02)  1.93(0.03)  1.96(0.02)  1.95(0.03)  1.96(0.01)

5.2(  1.1)  9.2(  1.6)  10.5(  2.3)  10.3(  3.5)  16.1(  3.0)  15.2(  2.7)  33.5(  2.4)  0.0

2.11(0.03)  1.96(0.03)  2.26(0.03)  2.11(0.02)  2.16(0.02)  2.13(0.03)  2.15(0.01)


ETHNICITY/RACE
WHITE


BLACK

HISPANIC

OTHER


7916         1249837(  1%)


1500


405


232931(  2%)


1271  150212(  4%)


49212(  4%)


5.9(  1.2)  2.6(  0.9)  12.6(  2.9)  9.3(  3.5)  16. 2(  3.3) 

2.06(0.0?)  2.04(0.03)  2.22(0.02)  2.06(0.02)  2.10(0.02) 

4.2(  1.9)  30. 2(  4.9)  2.9(  1.0)  13.4(  4.6)  6.9(  2.7) 

1.61(0.05)  1.62(0.02)  2.15(0.07)  1.66(0.03)  1.9?(0.04) 

1.1(  0.5)  20.9(  9.0)  5.4(  1.6)  14.7(  7.6)  16.0(  6.1)

1.68(0.12)  1.61(0.03)  2.02(0.06)  1.90(0.03)  1.91(0.04) 

2.6(  1.0)  17. 6(  4.1)  13. 2(  3.7)  14. 6(  4.7)  22. 4(  4.9) 

2.01(0.11)  1.99(0.06)  2.23(0.04)  1.99(0.06)  2.15(0.07) 


13.9(  2.2)  37.3(  2.6)  0.0

2.11(0.02)  2.09(0.01) 

16. 5(  3.1)  24. 0(  3.4)  0.0 

1.69(0.04)  1.65(0.03) 

26.0(13.3)  15. 6(  3.2)  0.0 

1.86(0.04)  1.67(0.04)

11.2(  3.6)  16.2(  3.1)  0.0
2.04(0.13)  2.14(0.06) 


PARENTAL  EDUCATION
NOT  GRADUATED  H.S.


GRADUATED  H.S.
POST  H.S.


UNKNOWN


1072  152336(  5%)


3887  584033(  3%)


5038  785037(  3%)


1000


144685(  5%)


5.7(  1.7) 
1.88(0.07) 

6.6(  1.4) 
1.99(0.02) 

3.6(  0.9) 
2.14(0.03) 

5.1(  1.4) 
1.95(0.06) 


12. 0(  2.6) 
1.81(0.06) 

8.3(  1.5) 
1.88(0.04) 

6.5(  1.2) 
1.95(0.03) 

17. 0(  4.3) 
1.79(0.06) 


1.3(  0.5) 
2.04(0.11) 

4.9(  1.1) 
2.13(0.02) 

16. 5(  3.3) 
2.25(0.02) 

6.1(  1.4) 
2.12(0.05) 


7.6(  2.7) 

1.65(0.06)

10.2(  3.3)
2.02(0.03)

11.6(  4.2)
2.05(0.02) 

10. 4(  3.4) 
1.90(0.06) 


13.5(  3.7) 
1.91(0.03) 

16. 3(  3.1) 
2.04(0.02) 

17. 9(  3.1) 
2.14(0.02) 

16.0(  3.7) 
1.91(0.04) 


16. 3(  5.9) 
1.69(0.04) 

13. 7(  2.7) 
2.01(0.02) 

15. 5(  2.4) 
2.12(0.03) 

16. 7(  5.3) 
1.93(0.04) 


41. 5(  3.4) 
1.91(0.03) 

39. 6(  2.6) 
2.04(0.02) 

26. 2(  2.4) 
2.14(0.02) 

24.6(  2.7) 
1.69(0.04) 


0.0 


0.0 


0.0 


0.0 


AGE 

13  YEARS  OLD 


14  OR  OLDER 


7420         1155803(  IX) 


3583  513626(  2X) 


5.1(  1.0) 
2.06(0.03) 

5.2(  1.3) 
1.95(0.04) 


7.9(  1.4) 
1.92(0.03) 

9.6(  2.1) 
1.61(0.03) 


12. 2(  2.6) 
2.22(0.02) 

7.4(  1.9) 
2.16(0.03) 


11. 0(  3.6) 
2.05(0.02) 

9.5(  3.3) 
1.92(0.04) 


16. 7(  3.1) 
2.09(0.02) 

16. 6(  3.3) 
2.01(0.02) 


14. 0(  2.3) 
2.07(0.02) 

16. 2(  3.9) 
1.96(0.03) 


33. 0(  2.3) 
2.09(0.01) 

33.3(  2.7) 
1.99(0.02) 


0.0 


0.0 




Table  15(59) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  8TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES


PARENTAL  EDUCATION


N

WEIGHTED  N

NOT  HS

GRAD  HS

POST  HS

UNKNOWN

%-OMIT

—  TOTAL  —

10997

1666092(

1%)

9.1(  0.4)
1.89(0.02)

35.1(  1.1)
2.02(0.01)

47.1(  1.2)
2.13(0.01)

8.7(  0.5)
1.90(0.02)

1.0

SEX 
MALE

5429 

830370C 

2X) 

8.0(  0.5) 
1.79C0.03) 

34. 8C  1.0) 
1.93C0.01) 

48. OC  1.2) 
2.04C0.01) 

9.2C  0.5) 
1.84C0.03) 

1.1 

FEMALE

5568 

835722C 

IX) 

10. 3C  0.5) 
1.97C0.02) 

35. 3C  1.2) 
2.11C0.01) 

46. 3C  1.4) 
2.23C0.01) 

8.1C  0.6) 
1.98(0.03) 

0.8 

ETHNICITY/RACE 
WHITE

7842

1236555( 

IX) 

7.1C  0.4) 
1.95(0.02) 

36. 5C  1.2) 
2.06C0.01) 

50. 3C  1.4) 
2.18C0.01) 

6.1C  0.4) 
1.98C0.02) 

1.1 

BLACK 

1492 

231722( 

2X) 

11. 9C  0.9) 
1.76C0.05) 

35.0(  1.9)
1.87(0.02)

41. IC  1.7) 
1.92C0.02) 

12. OC  1.4) 
1.78C0.05) 

0.5 

HISPANIC 

1266 

149566 ( 

4X) 

21. 6C  2.7) 
1.84C0.04) 

27. 3C  3.1) 
1.89C0.03) 

28. 8C  2.5) 
1.96C0.02) 

22. 3C  2.5) 
1.79C0.03) 

0.4 

OTHER

397 

48248C 

4X) 

9.8C  2.2) 
1.93(0.08) 

23. 2C  3.0) 
2.03C0.05) 

51. 3C  3.5) 
2.17C0.06) 

15. 7C  2.6) 
2.02C0.08) 

2.0 

PARENTAL  EDUCATION 
NOT  GRADUATED  H.S. 

1072 

152336 ( 

5X) 

100. 0(  0.0) 
1.89( 0.02) 

0.0(  0.0) 
«««K«C0.0  ) 

O.OC  0.0) 
)HHIIH(C0.0  ) 

O.OC  0.0) 

WHHf^CO.O  ) 

0.0 

GRADUATED  H.S. 

3887 

3X) 

0.0(  0.0) 
MHi^CO.O  ) 

100. OC  0.0) 
2.02C0.01) 

O.OC  0.0) 

«»IHH(C0.0  ) 

O.OC  0.0) 
IH(««»C0.0  ) 

0.0 

POST  H.S. 

5038 

785037( 

3X) 

0.0(  0.0) 

IHHHHffO.O  ) 

O.OC  0.0) 
«»«IH(C0.0  ) 

100. OC  0.0) 
2.13C0.01) 

O.OC  0.0) 
ffffff)HfC0.0  ) 

0.0 

UNKNOWN 

1000 

144685(

5X) 

O.OC  0.0) 

IHMHH(C0.0  ) 

O.OC  0.0) 

IHIKKKCO.O  ) 

O.OC  0.0) 
««»»»C0.0  ) 

100. OC  0.0) 
1.90C0.02) 

0.0 

AGE 

13  YEARS  OLD

7359 

1144958( 

IX) 

7.0C  «.3) 
1.92C0.03) 

34. 4C  1.2) 
2.05C0.01) 

51. 3C  1.3) 
2.15C0.01) 

7.3C  0.4) 
1.93C0.02) 

0.9 

14  OR  OLDER 

3i  ^« 

508371 ( 

2X) 

13.9C  0.8) 
1.86(0.02) 

36. 7C  1.1) 
1.97C0.01) 

37.5C  1.4) 
2.07C0.02) 

11. 8C  1.0) 
1.86C0.02) 

1.0 



Table  15(60)

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING  A.R.M.  MEANS  -  REPORTING  VARIABLES

PERCENT  AT  OR  ABOVE  ANCHOR  POINTS

                            N    WEIGHTED  N           1.0           2.0           3.0           4.0

TOTAL                   11092  1682192(  1%)   99.5(  0.1)   55.0(  0.6)    1.0(  0.1)    0.0(  0.0)

SEX
  MALE                   5486   839776(  1%)   99.1(  0.1)   47.0(  0.8)    0.5(  0.1)    0.0(  0.0)
  FEMALE                 5606   842416(  1%)   99.9(  0.1)   63.0(  0.8)    1.4(  0.1)    0.0(  0.0)

ETHNICITY/RACE
  WHITE                  7916  1249837(  1%)   99.8(  0.1)   60.6(  0.7)    1.2(  0.1)    0.0(  0.0)
  BLACK                  1500   232931(  2%)   98.7(  0.3)   36.3(  1.8)    0.1(  0.0)    0.0(  0.0)
  HISPANIC               1271   150212(  4%)   98.5(  0.4)   36.8(  1.?)    0.3(  0.2)    0.0(  0.0)
  OTHER                   405    49212(  4%)   99.9(  0.1)   57.2(  4.0)    1.6(  0.7)    0.0(  0.0)

PARENTAL  EDUCATION
  NOT  GRADUATED  H.S.   1072   152336(  5%)   99.1(  0.3)   40.5(  1.9)    0.0(  0.0)    0.0(  0.0)
  GRADUATED  H.S.        3887   584033(  3%)   99.6(  0.1)   51.9(  0.8)    0.4(  0.1)    0.0(  0.0)
  POST  H.S.             5038   785037(  3%)   99.7(  0.1)   63.1(  0.9)    1.6(  0.2)    0.0(  0.0)
  UNKNOWN                1000   144685(  5%)   99.0(  0.4)   39.5(  1.5)    0.6(  0.3)    0.0(  0.0)

AGE
  13  YEARS  OLD         7420  1155803(  1%)   99.7(  0.1)   57.6(  0.7)    1.2(  0.1)    0.0(  0.0)
  14  OR  OLDER          3583   513626(  2%)   99.1(  0.1)   49.1(  1.0)    0.5(  0.1)    0.0(  0.0)



Table  15(61) 

NAEP  1983-84  READING  AND  WRITING  ASSESSMENT  -  STUDENT  QUESTIONNAIRE  -  11TH  GRADERS
WEIGHTED  RESPONSE  PERCENTAGES  AND  GENERAL  WRITING    A.R.M.  MEANS  -  REPORTING  VARIABLES

IMPUTED  STUDENT  GRADE

                            N    WEIGHTED  N      GRADE  11    %-OMIT

TOTAL                   10657  1430241(  1%)   100.0(  0.0)       0.0
                                                 2.19(0.01)

SEX
  MALE                   5215   714418(  2%)   100.0(  0.0)       0.0
                                                 2.09(0.01)
  FEMALE                 5442   715823(  2%)   100.0(  0.0)       0.0
                                                 2.29(0.01)

ETHNICITY/RACE
  WHITE                  7892  1077899(  1%)   100.0(  0.0)       0.0
                                                 2.24(0.01)
  BLACK                  1478   205670(  2%)   100.0(  0.0)       0.0
                                                 2.00(0.02)
  HISPANIC                902   107250(  3%)   100.0(  0.0)       0.0
                                                 2.00(0.02)
  OTHER                   385    39422(  4%)   100.0(  0.0)       0.0
                                                 2.16(0.03)

PARENTAL  EDUCATION
  NOT  GRADUATED  H.S.   1267   159736(  5%)   100.0(  0.0)       0.0
                                                 1.99(0.02)
  GRADUATED  H.S.        3675   479173(  3%)   100.0(  0.0)       0.0
                                                 2.15(0.01)
  POST  H.S.             5312   740485(  3%)   100.0(  0.0)       0.0
                                                 2.27(0.01)
  UNKNOWN                 290    37087(  7%)   100.0(  0.0)       0.0
                                                 1.99(0.03)

AGE
  16  OR  YOUNGER        1102   184232(  6%)   100.0(  0.0)       0.0
                                                 2.23(0.02)
  17  YEARS  OLD         7919   963071(  1%)   100.0(  0.0)       0.0
                                                 2.21(0.01)
  18  OR  OLDER          1636   282938(  3%)   100.0(  0.0)       0.0
                                                 2.08(0.02)




Table  15(62)

STUDENT  SEX


—  TOTAL  —


N
10657


WEIGHTED  N


MALE

50.0(  0.9)
2.09(0.01)


FEMALE

50.0(  0.9)
2.29(0.01)


%-OMIT
0.0


SEX 
MALE 


FEMALE 


5215          7mieC  2X)         100. 0(  0.0)  O.OC  0.0) 

2.09(0.01;  »««)H((0.0  ) 

5442           7156231  2Z)            0.0(  0.0)  100.0(  0.0) 

IH(KKK(0.0  )  2.29(0.01) 


0.0 
0.0 


ETHNICITY/RACE 
NHITE 


BUCK 

HISPANIC 

OTHER 


7892 


1478 


902 


385 


1077899(  VA) 
2056701  2X) 
107250(  3X) 
394221  4X) 


49. 6(  0.9) 
2.14(0.01) 

49. 2(  1.9) 
1.91(0.02) 

52. 1(  2.0) 
1*92(0.03) 

56. 5(  3.0) 
2.09(0.04) 


50.4C  0.9) 
2.35(0.01) 

50.8(  1.9) 
2.09(0.02) 

47.9(  2.0) 
2.09(0.03) 

43.5(  3.0) 
2.26(0.06) 


0.0 
0.0 
0.0 
0.0 


PAREMTAL  EDUCATION 
NOT  CRADUATEO  H.S. 


GRADUATED  H.S. 


POST  H.S. 


UNKNOItl 


1267 
3675 
5312 
290 


159736(  5X) 

479173(  3X) 

740485(  3X) 

37087(  7X) 


46. 6(  1.4) 
1.90(0.03) 

50. 0(  0.9) 
2.05(0.01) 

50. 1(  1.3) 
2.16(0.01) 

58. 1(  3.3) 
1.93(0.04) 


53.4(  1.4) 
2.06(0.03) 

50. 0(  0.9) 
2.25(0.02) 

49. 9C  1.3) 
2.37(0.01) 

41. 9(  3.3) 
2.07(0.06) 


0.0 
0.0 
0.0 
CO 


AGE 

16  OR  YOUNGER 


17  YEARS  OLD 


IS  OR  OLDER 


1102 
7919 
1636 


184232(  6X) 
963071(  IX) 
282938(  y/.) 


43. 2(  2.2) 
2.11(0.03) 

47.7(  0.9) 
2.11(0.01) 

62. 1(  1.8) 
2.02(0.02) 


56.8(  2.2) 
2.32(0.02) 

52. 3(  0.9) 
2.30(0.01) 

37.9(  1.8) 
2.18(0.03) 


671" 


0.0 


0.0 


0.0 

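Because each subgroup row of Table 15(62) distributes the subgroup across the MALE and FEMALE columns, the two percentages in a row should add to about 100. The sketch below is one way to check a transcription of the table against that property. It is not part of the NAEP analysis; the function names and the tolerance are illustrative assumptions, and the value in parentheses is only assumed to be a standard error.

```python
import re

# A printed cell looks like "49.6( 0.9)": an estimate followed by a
# parenthesized value (assumed here to be a standard error).
CELL = re.compile(r"^\s*(\d+(?:\.\d+)?)\(\s*(\d+(?:\.\d+)?)\)\s*$")


def parse_cell(text):
    """Return (estimate, parenthesized_value) for a cell such as '49.6( 0.9)'."""
    match = CELL.match(text)
    if match is None:
        raise ValueError(f"unreadable cell: {text!r}")
    return float(match.group(1)), float(match.group(2))


# ETHNICITY/RACE rows of Table 15(62): (MALE cell, FEMALE cell).
ROWS = {
    "WHITE": ("49.6( 0.9)", "50.4( 0.9)"),
    "BLACK": ("49.2( 1.9)", "50.8( 1.9)"),
    "HISPANIC": ("52.1( 2.0)", "47.9( 2.0)"),
    "OTHER": ("56.5( 3.0)", "43.5( 3.0)"),
}


def row_total(cells):
    """Sum the percentage estimates in one table row."""
    return sum(parse_cell(cell)[0] for cell in cells)


def check_rows(rows, tolerance=0.15):
    """Map each row label to True when its percentages sum to 100 within tolerance."""
    return {
        label: abs(row_total(cells) - 100.0) <= tolerance
        for label, cells in rows.items()
    }


if __name__ == "__main__":
    for label, ok in check_rows(ROWS).items():
        print(label, "ok" if ok else "check transcription")
```

The tolerance allows for the one-decimal rounding of the printed percentages; a row that fails the check points to a misread digit rather than to a problem in the assessment data.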

Table 15(63)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES

ETHNICITY/RACE

                           N     WEIGHTED N        WHITE        BLACK     HISPANIC     AMER IND        ASIAN      UNCLASS   %-OMIT

-- TOTAL --            10657   1430241( 1%)    75.4( 0.3)                 7.5( 0.2)    0.9( 0.1)    1.9( 0.1)    0.0( 0.0)      0.0
                                                2.24(0.01)   2.00(0.02)   2.00(0.02)   2.10(0.05)   2.19(0.04)   2.04(0.34)

SEX
  MALE                  5215    714418( 2%)    74.9( 0.6)   14.2( 0.6)    7.8( 0.3)    1.1( 0.1)    2.0( 0.2)    0.0( 0.0)      0.0
                                                2.14(0.01)   1.91(0.02)   1.92(0.03)   2.02(0.05)   2.13(0.07)   *****(0.0)
  FEMALE                5442    715823( 2%)    75.8( 0.6)   14.6( 0.5)    7.2( 0.3)    0.6( 0.1)    1.8( 0.2)    0.0( 0.0)      0.0
                                                2.35(0.01)   2.09(0.02)   2.09(0.03)   2.25(0.09)   2.27(0.07)   2.04(0.34)

ETHNICITY/RACE
  WHITE                 7892   1077899( 1%)   100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                                2.24(0.01)   *****(0.0)   *****(0.0)   *****(0.0)   *****(0.0)   *****(0.0)
  BLACK                 1478    205670( 2%)     0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                                *****(0.0)   2.00(0.02)   *****(0.0)   *****(0.0)   *****(0.0)   *****(0.0)
  HISPANIC               902    107250( 3%)     0.0( 0.0)    0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                                *****(0.0)   *****(0.0)   2.00(0.02)   *****(0.0)   *****(0.0)   *****(0.0)
  OTHER                  385     39422( 4%)     0.0( 0.0)    0.0( 0.0)    0.0( 0.0)   31.5( 3.2)   68.0( 3.2)    0.5( 0.3)      0.0
                                                *****(0.0)   *****(0.0)   *****(0.0)   2.10(0.05)   2.19(0.04)   2.04(0.34)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1267    159736( 5%)    53.9( 2.7)   20.1( 1.8)   24.0( 2.5)    0.6( 0.2)    1.4( 0.3)    0.0( 0.0)      0.0
                                                2.07(0.03)   1.85(0.04)   1.91(0.04)   1.85(0.15)   2.06(0.16)   *****(0.0)
  GRADUATED H.S.        3675    479173( 3%)    77.0( 0.9)   15.5( 0.6)    5.5( 0.7)    1.0( 0.1)    1.0( 0.2)    0.0( 0.0)      0.0
                                                2.19(0.01)   1.98(0.03)   2.02(0.03)   2.05(0.11)   2.07(0.08)   *****(0.0)
  POST H.S.             5312    740485( 3%)    80.2( 0.8)   11.9( 0.7)    4.6( 0.5)    0.9( 0.1)    2.4( 0.2)    0.0( 0.0)      0.0
                                                2.30(0.01)   2.08(0.02)   2.10(0.03)   2.18(0.06)   2.25(0.05)   2.04(0.34)
  UNKNOWN                290     37087( 7%)    44.8( 4.4)   25.5( 2.8)   22.4( 3.3)    1.3( 0.7)    6.0( 1.5)    0.0( 0.0)      0.0
                                                2.05(0.05)   1.94(0.06)   1.89(0.07)   2.01(0.24)   2.07(0.13)   *****(0.0)

AGE
  16 OR YOUNGER         1102    184232( 6%)    74.9( 1.9)   16.4( 1.9)    5.8( 1.3)    0.4( 0.2)    2.5( 0.4)    0.0( 0.0)      0.0
                                                2.28(0.02)   2.06(0.04)   2.09(0.06)   2.10(0.13)   2.23(0.10)   *****(0.0)
  17 YEARS OLD          7919    963071( 1%)    79.4( 0.3)   11.7( 0.2)    6.4( 0.2)    0.9( 0.1)    1.5( 0.1)    0.0( 0.0)      0.0
                                                2.25(0.01)   2.02(0.03)   2.02(0.03)   2.11(0.07)   2.23(0.04)   2.04(0.34)
  18 OR OLDER           1636    282938( 3%)    61.8( 1.5)   22.3( 1.0)   12.2( 0.8)    1.0( 0.2)    2.6( 0.3)    0.0( 0.0)      0.0
                                                2.16(0.02)   1.93(0.03)   1.93(0.03)   2.06(0.09)   2.09(0.07)   *****(0.0)


Table 15(64)

REGION

                           N     WEIGHTED N           NE           SE      CENTRAL         WEST   %-OMIT

-- TOTAL --            10657   1430241( 1%)    24.6( 0.5)   21.9( 1.9)   27.5( 1.6)   26.0( 0.7)
                                                2.22(0.03)   2.16(0.02)   2.20(0.02)   2.17(0.01)

SEX
  MALE                  5215    714418( 2%)    25.5( 0.7)   21.2( 2.0)   26.5( 1.6)   26.6( 1.2)      0.0
                                                2.12(0.02)   2.05(0.02)   2.10(0.02)   2.07(0.01)
  FEMALE                5442    715823( 2%)    24.1( 0.9)   22.7( 2.1)   26.2( 2.2)   25.1( 1.3)      0.0
                                                2.32(0.04)   2.25(0.02)   2.30(0.03)   2.27(0.02)

ETHNICITY/RACE
  WHITE                 7892   1077899( 1%)    26.2( 0.3)   19.9( 2.2)   30.9( 2.1)   23.0( 0.3)      0.0
                                                2.26(0.03)   2.23(0.02)   2.23(0.02)   2.24(0.01)
  BLACK                 1478    205670( 2%)    24.6( 0.7)   40.6( 0.9)   16.1( 3.3)   16.5( 3.7)      0.0
                                                2.04(0.04)   1.96(0.03)   2.00(0.03)   2.00(0.03)
  HISPANIC               902    107250( 3%)    13.4( 5.0)   11.0( 7.3)   10.2( 4.7)   65.4( 1.3)      0.0
                                                1.95(0.03)   2.03(0.06)   1.97(0.11)   2.01(0.02)
  OTHER                  385     39422( 4%)    17.4( 3.3)   10.3( 2.6)   22.9( 3.7)   49.4( 5.0)      0.0
                                                2.15(0.06)   2.20(0.10)   2.14(0.06)   2.17(0.05)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1267    159736( 5%)    22.4( 2.7)   26.2( 2.5)   22.0( 3.0)   27.4( 2.6)      0.0
                                                2.03(0.07)   1.95(0.03)   2.02(0.04)   1.95(0.04)
  GRADUATED H.S.        3675    479173( 3%)    25.5( 2.0)   21.6( 2.1)   32.1( 2.6)   20.6( 1.5)      0.0
                                                2.17(0.02)   2.11(0.02)   2.17(0.02)   2.13(0.02)
  POST H.S.             5312    740485( 3%)    25.3( 1.6)   20.2( 2.7)   25.4( 2.2)   29.1( 1.5)      0.0
                                                2.29(0.03)   2.25(0.02)   2.26(0.02)   2.24(0.01)
  UNKNOWN                290     37087( 7%)    21.6( 3.6) 2[?].6( 4.0)   24.0( 4.6)   31.4( 2.9)      0.0
                                                2.02(0.05)   1.95(0.07)   2.00(0.10)   1.96(0.05)

AGE
  16 OR YOUNGER         1102    184232( 6%)    32.1( 2.6)   26.2( 3.7)   16.6( 3.2)   23.1( 2.1)      0.0
                                                2.25(0.04)   2.21(0.04)   2.24(0.05)   2.22(0.04)
  17 YEARS OLD          7919    963071( 1%)    24.3( 0.4)   20.5( 1.9)   29.1( 1.7)   26.2( 0.7)      0.0
                                                2.24(0.03)   2.19(0.02)   2.22(0.02)   2.19(0.01)
  18 OR OLDER           1636    282938( 3%)    21.6( 1.9)   24.0( 2.1)   27.1( 3.2)   27.1( 1.7)      0.0
                                                2.11(0.03)   2.02(0.03)   2.12(0.05)   2.07(0.02)


Table 15(65)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES

IMPUTED STUDENT AGE

                           N     WEIGHTED N      15-LESS           16           17           18           19      20-MORE   %-OMIT

-- TOTAL --            10657   1430241( 1%)     0.2( 0.0)   12.7( 0.7)   67.3( 0.4)   17.3( 0.5)    2.1( 0.2)    0.4( 0.1)      0.0
                                                2.30(0.14)   2.23(0.02)   2.21(0.01)   2.10(0.02)   1.94(0.04)   1.99(0.11)

SEX
  MALE                  5215    714418( 2%)     0.1( 0.1)   11.0( 0.0)   64.2( 0.7)   21.4( 0.7)    2.7( 0.3)    0.5( 0.1)      0.0
                                                2.16(0.16)   2.11(0.03)   2.11(0.01)   2.04(0.02)   1.89(0.05)   1.94(0.12)
  FEMALE                5442    715823( 2%)     0.2( 0.1)   14.4( 0.9)   70.4( 0.6)   13.2( 0.6)    1.4( 0.2)    0.3( 0.1)      0.0
                                                2.39(0.17)   2.32(0.03)   2.30(0.01)   2.20(0.03)   2.04(0.07)   2.06(0.15)

ETHNICITY/RACE
  WHITE                 7892   1077899( 1%)     0.1( 0.0)   12.7( 0.6)   71.0( 0.4)   14.9( 0.7)    1.1( 0.1)    0.2( 0.1)      0.0
                                                2.34(0.15)   2.26(0.02)   2.25(0.01)   2.17(0.02)   2.01(0.06)   2.23(0.20)
  BLACK                 1478    205670( 2%)     0.2( 0.1)   14.5( 1.7)   54.6( 1.0)   24.0( 1.1)    5.7( 1.0)    1.0( 0.4)      0.0
                                                2.20(0.22)   2.06(0.04)   2.02(0.03)   1.95(0.03)   1.67(0.06)   1.79(0.15)
  HISPANIC               902    107250( 3%)     0.3( 0.3)    9.6( 2.1)   57.7( 1.6)   26.4( 1.6)    4.5( 1.1)    1.4( 0.4)      0.0
                                                2.24(1.79)   2.08(0.05)   2.02(0.03)   1.94(0.04)   1.66(0.10)   1.90(0.17)
  OTHER                  385     39422( 4%)     0.9( 0.4)   12.6( 1.6)   60.6( 2.1)   20.9( 1.7)    3.2( 0.6)    1.7( 0.7)      0.0
                                                2.29(0.14)   2.20(0.09)   2.19(0.03)   2.07(0.06)   2.13(0.19)   2.07(0.26)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1267    159736( 5%)     0.2( 0.1)    7.7( 1.2)   55.6( 2.1)   29.4( 2.0)    6.2( 1.0)    0.9( 0.3)      0.0
                                                2.13(0.42)   2.07(0.06)   2.02(0.02)   1.94(0.03)   1.63(0.10)   1.62(0.16)
  GRADUATED H.S.        3675    479173( 3%)     0.2( 0.1)   11.4( 1.0)   66.4( 0.9)   17.9( 0.6)    1.6( 0.3)    0.3( 0.1)      0.0
                                                2.35(0.15)   2.15(0.03)   2.17(0.01)   2.09(0.02)   1.99(0.09)   2.09(0.21)
  POST H.S.             5312    740485( 3%)     0.2( 0.1)   14.6( 0.6)   69.6( 0.7)   13.7( 0.7)    1.2( 0.2)    0.3( 0.1)      0.0
                                                2.30(0.20)   2.29(0.02)   2.26(0.01)   2.20(0.03)   2.02(0.06)   2.03(0.25)
  UNKNOWN                290     37087( 7%)     0.0( 0.0)    9.7( 2.0)   54.6( 3.0)   27.7( 2.9)    5.7( 1.6)    2.3( 0.9)      0.0
                                                *****(0.0)   2.07(0.15)   1.99(0.04)   1.96(0.10)   1.96(0.10)   1.96(0.23)

AGE
  16 OR YOUNGER         1102    184232( 6%)     1.4( 0.4)   98.6( 0.4)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                                2.30(0.14)   2.23(0.02)   *****(0.0)   *****(0.0)   *****(0.0)   *****(0.0)
  17 YEARS OLD          7919    963071( 1%)     0.0( 0.0)    0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                                *****(0.0)   *****(0.0)   2.21(0.01)   *****(0.0)   *****(0.0)   *****(0.0)
  18 OR OLDER           1636    282938( 3%)     0.0( 0.0)    0.0( 0.0)    0.0( 0.0)   87.3( 1.0)   10.5( 0.9)    2.2( 0.3)      0.0
                                                *****(0.0)   *****(0.0)   *****(0.0)   2.10(0.02)   1.94(0.04)   1.99(0.11)


Table 15(66)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES

SIZE/TYPE OF COMMUNITY

                           N     WEIGHTED N        RURAL      DIS URB      ADV URB     BIG CITY       FRINGE       MEDIUM        SMALL   %-OMIT

-- TOTAL --            10657   1430241( 1%)
                                                2.13(0.03)   2.01(0.02)   2.28(0.02)   2.18(0.02)   2.19(0.02)   2.21(0.02)   2.19(0.01)

SEX
  MALE                  5215    714418( 2%)     5.7( 1.2)    9.2( 2.0)   18.6( 2.6)    7.4( 2.1)    9.7( 2.7)   16.7( 2.1)   32.7( 1.8)      0.0
                                                2.03(0.04)   1.92(0.02)   2.19(0.02)   2.05(0.03)   2.07(0.03)   2.10(0.02)   2.09(0.02)
  FEMALE                5442    715823( 2%)     5.4( 1.2)   10.7( 2.5)   15.0( 2.9)   10.2( 2.5)   10.1( 2.7)   17.1( 1.2)   31.5( 1.8)      0.0
                                                2.23(0.04)   2.09(0.02)   2.40(0.03)   2.27(0.02)   2.30(0.03)   2.31(0.03)   2.30(0.02)

ETHNICITY/RACE
  WHITE                 7892   1077899( 1%)     5.3( 1.1)    4.0( 1.5)   18.6( 3.0)    7.2( 2.0)   10.4( 2.8)   17.5( 1.2)   36.9( 2.0)      0.0
                                                2.20(0.03)   2.13(0.03)   2.31(0.02)   2.25(0.02)   2.22(0.03)   2.26(0.02)   2.22(0.02)
  BLACK                 1478    205670( 2%)     7.5( 3.5)   30.6( 8.9)    8.8( 3.1)   12.3( 3.9)    8.7( 3.7)   12.9( 2.6)   19.2( 3.6)      0.0
                                                1.94(0.04)   1.97(0.02)   2.13(0.06)   2.03(0.03)   2.07(0.06)   2.00(0.05)   1.97(0.04)
  HISPANIC               902    107250( 3%)     4.8( 3.0)   27.2( 8.9)   12.5( 6.5)   16.0( 6.7)    7.0( 3.3)   19.1( 8.5)   13.4( 4.8)      0.0
                                                1.99(0.15)   1.90(0.03)   2.05(0.05)   2.06(0.04)   2.03(0.05)   2.03(0.03)   2.01(0.06)
  OTHER                  385     39422( 4%)     3.1( 1.5)   18.0( 5.1)   19.5( 5.1)   14.5( 4.0)   12.7( 4.4)   14.9( 2.0)   17.4( 2.8)      0.0
                                                1.96(0.24)   2.07(0.05)   2.30(0.08)   2.23(0.09)   2.20(0.07)   2.09(0.08)   2.12(0.06)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1267    159736( 5%)     8.7( 2.5)   18.4( 4.6)    4.3( 1.1)    7.8( 2.1)    6.0( 2.0)   15.6( 2.8)   39.0( 2.6)      0.0
                                                1.94(0.07)   1.87(0.03)   2.00(0.12)   2.01(0.05)   2.02(0.07)   1.99(0.06)   2.03(0.03)
  GRADUATED H.S.        3675    479173( 3%)     6.9( 1.4)   10.6( 2.5)    8.6( 1.9)    8.4( 2.1)   10.1( 2.9)   14.9( 1.6)   40.4( 2.2)      0.0
                                                2.14(0.04)   2.01(0.02)   2.19(0.04)   2.14(0.04)   2.14(0.03)   2.17(0.02)   2.17(0.02)
  POST H.S.             5312    740485( 3%)     3.4( 0.7)    7.2( 1.6)   24.6( 4.1)    9.2( 2.4)   10.7( 2.9)   18.8( 1.7)   26.0( 1.8)      0.0
                                                2.24(0.05)   2.09(0.03)   2.32(0.02)   2.24(0.04)   2.25(0.03)   2.27(0.02)   2.28(0.02)
  UNKNOWN                290     37087( 7%)   7.0( [?].2)   23.9( 6.4)    7.3( 2.4)   11.8( 3.3)   12.7( 4.0)   13.0( 2.9)   24.2( 3.0)      0.0
                                                1.92(0.12) 1.[?]3(0.06)   2.03(0.11)   2.02(0.13)   2.00(0.09)   1.97(0.14)   2.00(0.06)

AGE
  16 OR YOUNGER         1102    184232( 6%)     4.7( 1.3)   10.5( 3.3)   22.0( 4.3)   11.0( 3.2)   10.5( 3.0)   17.6( 2.9)   23.8( 2.8)      0.0
                                                2.13(0.06)   2.03(0.04)   2.30(0.03)   2.20(0.06)   2.21(0.04)   2.25(0.05)   2.28(0.04)
  17 YEARS OLD          7919    963071( 1%)     5.3( 1.2)    7.8( 1.8)   17.2( 2.8)    8.4( 2.1)   10.8( 3.0)   17.2( 1.3)   33.2( 1.5)      0.0
                                                2.16(0.04)   2.04(0.03)   2.30(0.03)   2.21(0.03)   2.20(0.02)   2.22(0.02)   2.21(0.01)
  18 OR OLDER           1636    282938( 3%)     6.8( 1.7)   16.8( 3.5)   11.8( 2.2)    8.9( 2.4)    6.7( 2.0) 15.2( 2.[?])   33.8( 2.5)      0.0
                                                2.04(0.05)   1.94(0.03)   2.19(0.06)   2.06(0.05)   2.11(0.06)   2.14(0.05)   2.10(0.03)


Table 15(67)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES

PARENTAL EDUCATION

                           N     WEIGHTED N       NOT HS      GRAD HS      POST HS      UNKNOWN   %-OMIT

-- TOTAL --            10544   1416480( 1%)    11.3( 0.6)   33.8( 1.2)   52.3( 1.3)    2.6( 0.2)
                                                1.99(0.02)   2.15(0.01)   2.27(0.01)   1.99(0.03)

SEX
  MALE                  5150    706374( 2%)    10.5( 0.5)   33.9( 1.2)   52.5( 1.4)    3.1( 0.2)      1.1
                                                1.90(0.03)   2.05(0.01)   2.16(0.01)   1.93(0.04)
  FEMALE                5394    710106( 2%)  12.0( 0.[?])   33.7( 1.4)   52.1( 1.5)    2.2( 0.3)      0.8
                                                2.06(0.03)   2.25(0.02)   2.37(0.01)   2.07(0.06)

ETHNICITY/RACE
  WHITE                 7789   1065566( 1%)     8.1( 0.6)   34.6( 1.3)   55.7( 1.6)    1.6( 0.2)      1.1
                                                2.07(0.03)   2.19(0.01)   2.30(0.01)   2.05(0.05)
  BLACK                 1472    204531( 2%)    15.7( 1.5)   36.4( 2.1)   43.2( 2.3)    4.6( 0.6)      0.6
                                                1.85(0.04)   1.98(0.03)   2.08(0.02)   1.94(0.06)
  HISPANIC               900    107100( 3%)    35.6( 4.6)   24.7( 2.7)   31.7( 3.3)    7.6( 1.4)      0.1
                                                1.91(0.04)   2.02(0.03)   2.10(0.03)   1.89(0.07)
  OTHER                  383     39283( 4%)     6.1( 1.2)   23.6( 2.6)   61.2( 3.5)    6.9( 1.5)      0.4
                                                2.00(0.11)   2.06(0.07)   2.23(0.04)   2.06(0.12)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1267    159736( 5%)   100.0( 0.0)    0.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                                1.99(0.02)   *****(0.0)   *****(0.0)   *****(0.0)
  GRADUATED H.S.        3675    479173( 3%)     0.0( 0.0)  100.0( 0.0)    0.0( 0.0)    0.0( 0.0)      0.0
                                                *****(0.0)   2.15(0.01)   *****(0.0)   *****(0.0)
  POST H.S.             5312    740485( 3%)     0.0( 0.0)    0.0( 0.0)  100.0( 0.0)    0.0( 0.0)      0.0
                                                *****(0.0)   *****(0.0)   2.27(0.01)   *****(0.0)
  UNKNOWN                290     37087( 7%)     0.0( 0.0)    0.0( 0.0)    0.0( 0.0)  100.0( 0.0)      0.0
                                                *****(0.0)   *****(0.0)   *****(0.0)   1.99(0.03)

AGE
  16 OR YOUNGER         1092    182874( 6%)     6.9( 0.9)   30.3( 2.0)   60.6( 2.3)    2.0( 0.4)      0.7
                                                2.07(0.06)   2.16(0.03)   2.29(0.02)   2.07(0.15)
  17 YEARS OLD          7836    953727( 1%)     9.3( 0.6)   34.4( 1.3)   54.2( 1.4)    2.1( 0.2)      1.0
                                                2.02(0.02)   2.17(0.01)   2.26(0.01)   1.99(0.04)
  18 OR OLDER           1616    279879( 3%)    20.6( 1.5)   34.2( 1.3)   40.2( 1.7)    4.7( 0.6)      1.1
                                                1.92(0.03)   2.06(0.02)   2.16(0.03)   1.96(0.06)

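The N and WEIGHTED N columns of Table 15(67) can be cross-checked arithmetically: the subgroup counts within each reporting variable should add to the TOTAL row (10544 students, weighted N 1416480), apart from rounding of the weighted counts. The following is a minimal sketch of that check using the SEX and PARENTAL EDUCATION rows. The names and the rounding slack are illustrative assumptions, not part of the NAEP procedures.

```python
# TOTAL row of Table 15(67).
TOTAL_N = 10544
TOTAL_WEIGHTED_N = 1416480

# (N, weighted N) for each subgroup row, grouped by reporting variable.
GROUPS = {
    "SEX": [(5150, 706374), (5394, 710106)],
    "PARENTAL EDUCATION": [
        (1267, 159736),
        (3675, 479173),
        (5312, 740485),
        (290, 37087),
    ],
}


def totals(rows):
    """Return (sum of N, sum of weighted N) over the subgroup rows."""
    return sum(n for n, _ in rows), sum(w for _, w in rows)


def consistent(rows, total_n, total_weighted_n, weight_slack=5):
    """True when N matches exactly and weighted N matches within rounding slack."""
    n, weighted_n = totals(rows)
    return n == total_n and abs(weighted_n - total_weighted_n) <= weight_slack


if __name__ == "__main__":
    for name, rows in GROUPS.items():
        print(name, totals(rows), consistent(rows, TOTAL_N, TOTAL_WEIGHTED_N))
```

The slack is needed because each weighted N is printed as a rounded integer: the parental education rows add to 1416481, one more than the printed total.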

Table 15(68)

NAEP 1983-84 READING AND WRITING ASSESSMENT - STUDENT QUESTIONNAIRE - 11TH GRADERS
WEIGHTED RESPONSE PERCENTAGES AND GENERAL WRITING A.R.M. MEANS - REPORTING VARIABLES

PERCENT AT OR ABOVE ANCHOR POINTS

                           N     WEIGHTED N          1.0          2.0          3.0          4.0

-- TOTAL --            10657   1430241( 1%)    99.6( 0.1)   66.3( 0.6)  3.[?]( 0.2)    0.0( 0.0)

SEX
  MALE                  5215    714418( 2%)    99.3( 0.1)   57.6( 1.0)    1.6( 0.2)    0.0( 0.0)
  FEMALE                5442    715823( 2%)    99.6( 0.0)   75.0( 0.9)    4.9( 0.4)    0.0( 0.0)

ETHNICITY/RACE
  WHITE                 7892   1077899( 1%)    99.7( 0.1)   71.1( 0.9)    4.0( 0.3)    0.0( 0.0)
  BLACK                 1478    205670( 2%)    99.3( 0.3)   46.7( 1.2)    1.2( 0.3)    0.0( 0.0)
  HISPANIC               902    107250( 3%)    96.6( 0.3)   51.6( 2.0)    1.0( 0.3)    0.0( 0.0)
  OTHER                  385     39422( 4%)    99.6( 0.2)   67.2( 2.2)    3.6( 1.2)    0.0( 0.0)

PARENTAL EDUCATION
  NOT GRADUATED H.S.    1267    159736( 5%)    99.0( 0.3)   48.6( 2.0)    0.9( 0.3)    0.0( 0.0)
  GRADUATED H.S.        3675    479173( 3%)    99.5( 0.1)   62.6( 1.0)    2.5( 0.3)    0.0( 0.0)
  POST H.S.             5312    740485( 3%)    99.6( 0.1)   73.4( 1.0)    4.6( 0.4)    0.0( 0.0)
  UNKNOWN                290     37087( 7%)    99.3( 0.7)   46.4( 3.4)    0.5( 0.3)    0.0( 0.0)

AGE
  16 OR YOUNGER         1102    184232( 6%)    99.7( 0.1)   70.7( 1.6)    4.0( 0.7)    0.0( 0.0)
  17 YEARS OLD          7919    963071( 1%)    99.6( 0.1)   66.5( 0.9)    3.6( 0.3)    0.0( 0.0)
  18 OR OLDER           1636    282938( 3%)    99.6( 0.2)   56.1( 1.9)    2.0( 0.4)    0.0( 0.0)


APPENDIX  A 
Assessment  Items 


Table A(1)
Reading Items and Locations

  #  NAEP ID  DESCRIPTION

  1. N001101  PICTURE:CEREAL WITH TOY INSIDE IS PAX
  2. N001201  LONG DIST:RATE ON CALL-LOWER EVENING RATE
  3. N001202  LONG DIST:PERSON CALLS DIFF-OPR ASSISTED
  4. N001301  KOLA COUPON:GOOD FOR ANY SIZE CARTON
  5. N001302  KOLA COUPON:USE ON NOV. 10, 1970
  6. N001303  KOLA COUPON:PAYMENT IS 12 CENTS
  7. N001401  VERSE:DECK OF CARDS DESCRIBED IN POEM
  8. N001501  NUTS: DEVIL PUT PEARL IN WALNUT
  9. N001502  NUTS: FARM WIFE WAS CLEVER AND PRACTICAL
 10. N001503  NUTS: WANTED TRICK SOMEONE INTO CRACKING WALNUTS
 11. N001504  NUTS: PLAN WRONG-WOMAN WAS TOO CLEVER FOR HIM
 12. N001505  NUTS: IS THIS A GOOD STORY?
 13. N001506  NUTS: WHY WAS THIS A GOOD OR BAD STORY
 14. N001601  1ST AM:BITTER WINTER-EXTREMELY COLD
 15. N001602  1ST AM:ICE AGE PEOPLE DEPENDED ON ANIMALS TO LIVE
 16. N001603  1ST AM:NO LAND BRIDGE NOW-COVERED WITH WATER
 17. N001604  1ST AM:MAIN PURPOSE-EXPLN ICE AGE SETTLERS-N. AM.
 18. N001605  1ST AM:HOW INTERESTING WAS THIS ARTICLE
 19. N001606  1ST AM:HOW HARD WAS THIS ARTICLE TO READ
 20. N001701  BOOK CLUB:SHIPPING COSTS HIGHER IN CANADA
 21. N001702  BOOK CLUB:SEND NO MONEY TILL BILLED
 22. N001703  BOOK CLUB:BUY 6 MORE
 23. N001801  FLY:WANT OF THOUGHT-LACK OF THINKING
 24. N001802  FLY:FACING PROBLEMS SIMILAR TO HIS OWN
 25. N001901  CHARLEY1: MANS FEARS
 26. N001902  CHARLEY1: MOOD OF STORY
 27. N001903  CHARLEY1: CREATED MOOD
 28. N002001  WISH COULD FLY:GOSSAMER CONDOR 1ST MUSCLE-POWERED
 29. N002002  WISH COULD FLY:BIKE RACER, BRYAN ALLEN FLEW CONDOR
 30. N002003  WISH COULD FLY:MACCREADY PLANS DIFF-SIMPLR/LIGHTR
 31. N002101  VIRUSES:DIFFICULT TO STUDY
 32. N002102  VIRUSES:CLOTHE IDEA-GIVE PROOF TO SUPPORT
 33. N002201  PHONE BILL:FEB 14 CALL FROM ATHENS, GA
 34. N002202  PHONE BILL:FEB 14 CALL TO ST PAUL, MN
 35. N002203  PHONE BILL:FEB 14 CALL COST $.75
 36. N002301  THE DOOR:THOUGHTS ON POEM
 37. N002401  MOSQUITO:SIZE MOSQUITOES EXAGGERATED
 38. N002501  MARY:WILL GET MONEY FROM NEITHER
 39. N002701  ATMOSPHERE:4 WORDS CUE-FIRST,NEXT,ABOVE,FINALLY
 40. N002702  ATMOSPHERE:SCIENTISTS KNOW MOST ABOUT TROPOSPHERE
 41. N002801  BETHUNE: ROOSEVELT HONOR HER BY MAKING HER DIRECTO
 42. N002802  BETHUNE: START HER SCHOOL TO EDUCATE BLACK CHILDRN
 43. N002803  BETHUNE: MOST IMPORTANT THINGS ABOUT HER AND WHY
 44. N002901  SOCCER:DID YOU LIKE READING THIS ARTICLE
 45. N002902  SOCCER:MOST POPULAR BECAUSE PLAYED BY MILLIONS

Grade 4/Age 9

Block  Tape
H-05   3-07
H-10   3-01
H-11   3-02
H-12   3-03
H-13   3-04
H-14   3-05
H-15   3-06
J-12   3-08
J-13   3-09
       3-10
J-15   3-11
J-16   3-12
J-17   3-13
J-19   4-14
J-20   4-15
K-09   2-10
K-10   2-11
K-11   2-12
K-18   4-05
K-19   4-06
L-22   2-07
       2-13
L-20
L-24
L-25
L-26

Grade 8/Age 13    Grade 11/Age 17

Block  Tape  Block  Tape

w— U  D 

w— u  / 

^  — «  D 

4      1  ^ 

3  —  26 

n— u  0 

3  —  27 

3-27 

tff  no 

H— U  s 

H-lO 

H— 10 

H-11 

H  — 1 1 

H-12 

B  11 

H  — I  2 

3-21 

H-13 

3-21 

H-13 

H-14 

H-14 

H-15 

H-15 

H-16 

H-16 

H-17 

H-17 

H-18 

H-18 

H-19 

J— 1 1 

3  —  0  8 

3-08 

V  11 
J— 1 2 

3-0  9 

3-09 

T  14 
J— 13 

3—10 

3-10 

T  14 

J— H 

3  —  11 

3-11 

T     1  C 

J— 1  5 

4  11 
3  —  1  2 

3-12 

M  —  X  O 

3  — X  3 

3-13 

J  — 1  / 

3    0  9 
«  — U  J 

T       1  1 

J-12 

2-0  3 

J  —  X  O 

3  n  A 
•— u  % 

V  14 
J— 1  3 

2  —  0  4 

M  —  X  ^ 

3-0  ^ 

J— 14 

1  AC 

2  —  0  5 

M  —  *  u 

i  —  30 

A      1  A 

4  —  20 

7—3  1 
u  —  *  X 

i  —  3  1 
^  —  •  X 

A  11 

4  —  21 

J— 2  2 

7—1  * 
»l  —  X  D 

J-23 

7—1  K 
U  — X  O 

M  —  *  ^ 

T  in 

J— X  7 

ff  —  0  0 

m— U  If 

3  11 
«  — X  X 

K— 0  9 

2-11 

ff— 1  n 

«  — X  u 

3  —  13 
•  — X  « 

W      1  A 

K— 1 0 

2-12 

K— 1  1 
«—  X  X 

3  —  1  9 
«  —  X  J 

w  11 

K— 1 1 

1  14 

2  —  13 

n— X  « 

J  — O  ^ 
f  — U  9 

W  11 

K— 1 2 

4  —  0  5 

K— 1  3 

i  — OK 

V  19 

R— X  3 

A    n  £ 
f  — H  O 

K-14 

2-14 

K-14 

2-14 

K-15 

2-15 

K-15 

2-15 

K-16 

2-16 

K-16 

2-16 

K-17 

K-17 

L-22 

1.-23 

2-17 

L-27 

2-17 

L-24 

L-28 

L-29 

L-25 

L-30 

L-26 

L-31 

L-27 

L-32 

M-0  5 

M-0  5 

M-0  6 

M-06 

Table A(1)
Reading Items and Locations

  #  NAEP ID  DESCRIPTION

 46. N002903  SOCCER:KING ED WANTED TO OUTLAW-PRACTICE ARCHERY
 47. N002904  SOCCER:CALLED FOREIGN-IMMIGRANTS PLAYED IT MOST
 48. N002905  SOCCER:INTRO TO ENGLISH BY ROMANS
 49. N002906  SOCCER:PELE MASTER-FOOLED OPPONENTS BY FAKE MOVES
 50. N003001  SUPR COURT:CONSTITUTION DESCRIPTION-BRIEF
 51. N003002  SUPR COURT:DIFFICULT RESPON FOR COURT MEMBERS
 52. N003003  SUPR COURT:"THEIR" REFERS TO PROVISIONS
 53. N003101  GOODS: DIFF TO MARKET-ROADS POOR
 54. N003102  GOODS: YANKEE PEDDLER-TODAY SALESPERSON
 55. N003103  GOODS: COMPARE TRADING AND SELLING IN 1700 VS. NOW
 56. N003201  SUMMER JOB:SOC SECURITY APPLIC AT BANK OR POST OFC
 57. N003202  SUMMER JOB:BEST TIME TO FIND JOB-BEFORE MID-APRIL
 58. N003203  SUMMER JOB:NEED SS CARD TO GET HIRED
 59. N003204  SUMMER JOB:REFERENCES-PEOPLE WHO KNOW APPLICANT
 60. N003301  BOBBY:SAYS TALL IS SMART
 61. N003401  YOUNG GARDENERS:IN CENTRAL PARK-BEST
 62. N003501  TOASTER:DRAGON/TOASTER QUALITIES COMPARED
 63. N003601  MAGIC TRICK:FIRST TIE BLACK THREAD
 64. N003602  MAGIC TRICK:DIMLY LIT RM, SAY PRODUCE FROM AIR
 65. N003701  WEB LIFE: THREAD BREAKS-FALLS APART
 66. N003702  WEB LIFE: MAIN IDEA-PLNTS&ANMLS NEED EACH OTHER
 67. N003703  WEB LIFE: WHY YOU CHOSE A PARTICULAR MAIN IDEA
 68. N003801  SCOTT:BEST TITLE-SCOTT'S PLAN
 69. N003802  SCOTT:6 WEEKS BETWEEN DEPOTS
 70. N003803  SCOTT:CACHE-PLACE FOR STORING THINGS
 71. N003901  SELFISH PERSON:DESCRIPTION IN PASSAGE
 72. N004001  TRIANGLE:FIGURE DRAWN
 73. N004002  TRIANGLE:NAME FIGURE AS TRIANGLE
 74. N004101  NONSENSE WORD 1:KAG-FIRE
 75. N004201  MEOW-WOW:2 MONTH KITTEN-FEED 3 OR 4 TMS DAILY
 76. N004202  MEOW-WOW:CAT LEAVES FOOD-LEAVE BOWL FOR HIM
 77. N004301  JAVELIN:MAIN REASON
 78. N004302  JAVELIN:EXPLANATION OF AUTHORS IMPRESSION
 79. N004401  NAOMI JAMES:HOW LONG ON SAILING TRIP- 272 DAYS
 80. N004402  NAOMI JAMES:IMPORTANCE OF TRIP-BROKE WORLD RECORD
 81. N004403  NAOMI JAMES:WORST PART OF TRIP- BAD STORM
 82. N004501  AREA CODES:INFO NY-1-212-555-1212
 83. N004502  AREA CODES:SYRACUSE 1-315-255-6011
 84. N004601  JOBS 1900: MARTHA THINK-JOB TIRESOME
 85. N004602  JOBS 1900: JOE FOUND HARD-STAYING IN WOODS
 86. N004603  JOBS 1900: JOB AT HOME-ADDIE
 87. N004604  JOBS 1900: HOW WERE THE LIVES OF THE 4 DIFFERENT
 88. N004701  CARRIER AD:IF INTEREST & MEET REQRMNTS-CALL CIRC
 89. N004702  CARRIER AD:6 YR OLDS TOO YOUNG FOR JOB
 90. N004703  CARRIER AD:MUST DELIVER PAPERS BY 7 EACH AM

Grade 4/Age 9    Grade 8/Age 13    Grade 11/Age 17


Block 

Tape 

Block 

Tape 

Block 

Tape 

M-07 

M-07 

H-0  8 

M-08 

M-09 

M-09 

M-10 

M-10 

M-10 

3-14 

M-11 

3-15 

M-11 

3-15 

M-11 

3-15 

M-12 

3-16 

M-12 

3-16 

M-12 

3-16 

M-13 

3-17 

M-13 

3-17 

M-14 

M-14 

H-14 

M-15 

M-15 

H-15 

M-16 

M-16 

M-16 

R-12 

1-07 

R-21 

1-07 

9-13 

1-08 

9-22 

1-08 

N-14 

1-09 

N-23 

1-09 

N-15 

1-10 

N-24 

1-10 

9-16 

3-19 

N-25 

3-19 

9-17 

R-18 

2-10 

N-27 

2-10 

N-19 

1-13 

N-28 

1-13 

N-20 

1-14 

N-29 

1-14 

N-23 

N-21 

N-30 

N-24 

N-22 

N-31 

N-25 

N-23 

N-32 

0-12 

4-02 

0-12 

4-02 

0-12 

4-02 

0-13 

4-03 

0-13 

4-03 

0-13 

4-  03 

0-14 

4-04 

0-14 

4-04 

0-14 

4-04 

0-16 

3-14 

3-14 

0-15 

0-15 

0-17 

4-16 

0-17 

4-22 

4-22 

0-18 

4-12 

0-18 

4-18 

0-21 

4-18 

0-19 

4-13 

0-19 

4-19 

0-22 

4-15^ 

O-20 

0-2  3 

0-21 

0-24 

P-07 

4-09 

P-0  7 

4-15 

4-15 

P-08 

4-10 

P-0  8 

4-16 

4-16 

P-09 

4-11 

P-09 

4-17 

4-17 

P-10 

P-20 

P-11 

P-21 

P-12 

P-22 

P-13 

P-23 

P-14 

P-2  4 

P-15 

P-25 

Q-10 

1-15 

Q-07 

1-15 

1-15 

Q-11 

1-16 

Q-08 

1-16 

1-16 

Q-12 

1-17 

Q-09 

1-17 

1-17 

Table A(1)
Reading Items and Locations

  #  NAEP ID  DESCRIPTION

 91. N004801  SILKY 3:WISHED HE HAD SOME HAIR
 92. N004901  COLORADO:GOLD DISCOVERY DOESN'T BELONG
 93. N005001  ARTS:BEFORE 1940 ARTS WERE ORIENT'D TO ELITE
 94. N005002  ARTS:PRIVILEGE OF ARISTOCRATIC FEW-GREAT WORKS
 95. N005003  ARTS: MASS PROD NO HARM TO GENUINE ART
 96. N005101  DRAWING:WINNIE SHORTER THAN PAMELA-BEST STATEMENT
 97. N005201  TRAFFIC:APPEAR IN COURT TO PLEAD NOT GUILTY
 98. N005202  TRAFFIC:FINE-$3.00
 99. N005203  TRAFFIC:PAY FINE BY THURS, JUNE 11
100. N005301  SEALS: GET FOOD ON SHORE-FROM THEIR FAT
101. N005302  SEALS: SURPRISE IN MEXICO-THOUGHT SEALS EXTINCT
102. N005303  SEALS: MAIN PURPOSE-DESCRIBE SEALS
103. N005304  SEALS: COME SHORE YEARLY-BIRTH TO YOUNG
104. N005305  SEALS: BLUBBER MEANING-FAT
105. N005401  HERO: INTERESTING ARTICLE
106. N005402  HERO: EASE OF READING ARTICLE
107. N005403  HERO: MAIN IDEA-SIMON WAS A GREAT HERO
108. N005404  HERO: FROM WHAT COUNTRY-VENEZUELA
109. N005405  HERO: TRUE-COLOMBIA ONCE SPANISH
110. N005406  HERO: MONEY CALLED 'BOLIVARS'
111. N005407  HERO: GOAL NEVER REACHED-COUNTRIES JOIN TOGETHER
112. N005501  BUSINESS: INTERESTING
113. N005502  BUSINESS: EASE OF READING
114. N005503  BUSINESS: MAIN PURPOSE-BUSINESS TERMS MEAN
115. N005504  BUSINESS: OWE 50 DOLLARS FOR BIKE-A LIABILITY
116. N005505  BUSINESS: EXTRA MONEY IS PROFIT
117. N005601  TREES: TRAP POLLUTANTS-LEAVES
118. N005602  TREES: CLEANING THE AIR-FILTERING PARTICLES
119. N005603  TREES: PURPOS GREEN BELT-REDUCE CITY POLLUTION
120. N005701  GRAPH:MOST POWER 1910,19[?]5,2000-PETROLEUM
121. N005702  GRAPH:IN 3000,HYDROPOWER SUPPLY LESS THAN COAL
122. N005703  GRAPH:IN 2000 NUCLEAR POWER MORE % TOTAL THAN 1971
123. N005801  ENGLISH DIC:BOOK TELLS WORD MEANINGS-DICTIONARY
124. N005901  CARDCAT:CALL NUMBER-WRITE-IN 629.1 082
125. N005902  CARDCAT:PICTURES INDIC BY "ILLUS"
126. N006001  PHONE DIR:STORES SELL MILK LISTED UNDER DAIRIES
127. N006002  PHONE DIR:HENDRICKS MINING ON 63RD ST, 443-1502
128. N006003  PHONE DIR:STAR TRACKER OPER TO REPAIR MICROSCOPE
129. N006101  WIND SYMBOLS:FOR 35 KNOTS-SYMBOL 3
130. N006201  INDEX:FIND KING DARIUS INFO ON PG 23
131. N006202  INDEX:FIND CUNEIFORM PRONUNCIATION
132. N006203  INDEX:1[?]75 FRENCH CONSTITUTION INFO ON PG 233
133. N006204  INDEX:ALTERNATE HDG./DUTCH EAST INDIES-INDONESIA
134. N006205  INDEX:DISARMAMENT IN EASTERN EUROPE INFO ON PG 279
135. N006301  CLOTHES SIZES:SHOE SIZE 6-40-1


Grade 4/Age 9

Block  Tape

Q-13  3-19
Q-14  3-17

Q-15  2-02

S-19  1-18

Grade 8/Age 13    Grade 11/Age 17

Block  Tape  Block  Tape

Q-10 

3-20 

3-20 

Q-11 

3-18 

o-l  0 

3  —  1 8 

Q-1  3 

2-06 

Q^O  7 

*  — 1#  o 

Q-1  4 

2-0  7 

y- 1#  o 

*  — 1#  / 

0-1  S 

2-08 

y  —  w  » 

9 — n  A 
2  — u  • 

0-12 

2-0  2 

9  n  9 

Q-16 

4-2  3 

<  —  2  J 

O- 1  7 

4-2  4 

Q-1  2 

J   ■>  J 

%  —  2% 

Q-1  8 

4-2  5 

Q-1  J 

<  —  2  5 

Q-1  9 

Q-  20 

O— 2 1 
y—  ^  A 

0-2  2 

O-  2  3 

R-0  5 

R— 06 

R-0  7 

R-08 

R-09 

R-10 

R~  11 

R-12 

R^l  2 

R-13 

A-l  3 

R-14 

R-14 

R-IS 

K-15 

R-16 

R-16 

R-17 

R-18 

R-19 

S-19 

3-28 

S-19 

3-28 

S-20 

3-29 

S-20 

3-29 

S-21 

3-30 

S-21 

3-30 

S-22 

1-34 

S-22 

1-34 

S-23 

S-23 

S-24 

S-24 

S-2S 

S-2S 

S-2  6 

S-26 

S-27 

S-27 

S-30 

1-18 

S-30 

S-31 

1-29 

S-31 

S-32 

1-30 

S-32 

S-33 

1-31 

S-33 

S-34 

1-32 

S-34 

S-3S 

1-33 

S-3S 

S-36 

1-11 

S-36 



  #  NAEP ID  DESCRIPTION


117. 

Hi. 

119. 
I49. 

141. 
142. 
141. 

145. 
I4<. 
147. 
I49. 
149. 
190. 
191. 
191. 
191. 
194. 
199. 
194. 
197. 
194. 
199. 
149. 
141. 
142. 
143. 
144. 

149. 

144. 

147. 

141. 

149. 

170. 

171. 

172. 

171. 

174. 

179. 

174. 

177. 

179. 

179. 

ISO. 


■004102 

■004401 

■004402 

■004901 

■004401 

■004402 

■004409 

■004404 

■004409 

■004701 

■004401 

■004402 

■004901 

■004902 

■004901 

■007001 

■007002 

■007001 

■007004 

■007101 

■007102 

■00710S 

■007104 

■007101 

■007102 

■007101 

■007104 

■007109 

■007104 

■007401 

■007402 

■007409 

■007404 

■007409 

■007404 

■007407 

■007901 

■007902 

■007909 

■007401 

■007402 

■007409 

■007404 

■004101 

■004102 


Table A(1)
Reading Items and Locations

Grade 4/Age 9

Block  Tape

Grade 8/Age 13

Grade 11/Age 17


CtO»l9  9X119 1  94  9lflATIl-44 

TIIT9I0I9T  rUiCI  TO  LOCATE  OULL  ■OrT-ZVDIX 
TlXTltOIlT  rtACl  flMD  DELTA  DEf !■ ./OEOO-OLOllAlT 
rZ^D  OOZDE:OrTZO^AL  OETIfSE^  OrrBEOl-ORACLE 
TABLE CONTENTS:MOST USEFUL IN AMERICAN HIST COURSE
TABLE CONTENTS:AMERICAN INDEPENDENCE IN UNIT I
TAOLE  CO^TE^TltlECO^lTlOCTZOB  AfT  CZVZL  IIA1-WAP.4 
TAOLE  C0«TE»T9tHAJ01  TOfZC  «Ar .  17-^ArrE^Z^09  WIfZX 
TAOLE  C0«TE9T9tNZDDLE  EA9T  MAf , 1954-1970  0^  M.5f4 
SCIENCE INDEX:WOLVES FIRST IN BOOK
MAP:SPANISH IN SOUTH

MAP:PEOPLE SETTLED IN ALASKA-NOT ENOUGH INFORM
NEWS:TV SCHEDULE-PG 22
NEWS:WEATHER FORECAST-PG 12
NEWS:STOCK AVERAGES-PGS 29-31

CATALOO  CDtWlAT  Z^fO  0ZVE9  LOCATZO^-OV  445  C424 
CATALOG  CD: 90  POl  OT^El  900K9  9AME  TOrZC-221 
CATALOO  CDlAOtiOll  Of  OOOK-COOPEl  ft  IZEDE^TOP 
CATALOO  CDlOT^El  ■EADZ^O  TO  PZ^D  OOOK-IZEDE^TOP 
009  IC^EDlLAlT  009  Z^  ETE^Z^O  LEAVE  CZTADEL  4:45PM 
909  9CiEDl2^D  9AT  AH  009  AllZVE  DOWITOini  4ll5AN 
909  9CiEDlNZ99  2l99PN  P»ON  ■A«COCK  VAZT  TZLL  3:35 
909  IC^EDlLV  109TZC  WED  9l42Aj|  AllV  WUrW  10:15AH 
OmZDOElsKZ^D  OP  PEOPLE  «»E  HtU  NE^-POl  TlAPPEftl 
91ZDOE1I0E9T  DE9C1Z0E  ITOIZEI-ITIETC^ED  T^E  TlOTII 
OmZDOElSlZHZLS^PO^Dl  OP  NOD  OOZLZNO  LZKE  N09^ 
OlZDOElsmiO  DZ9COVE1E0  LA^D  ■Olf  TELLOlfETO^R-CoLTEl 
OlZDOEllliOT  NZ99ED  ELK  0ECAO9E  ELK  OOT  Of  ftAKOE 
9RZDOEKIKTPEK0OLE-  LAKE9  TKAT  ■AD  ■O  OOTTOH 
NENOKTl  NAZ^  lEAlO^  VBZTE-DE9CKZ0E  DETAIL!  OP  SOMN 
PKO^T  POKC^  Z^  HOMMEl-CONPOKTAOLE 
PAMZLT  LZPE  lf01D-CL09E-KKZT 
9TK0PT  NEAflZYO-^ONZD 
SET  OP  ilfZ^0-9^ZP9  9AZL  0^  OCSiUi 
DE9CKZ1E  NOOD 
.....w...   CKEATED  NOOD 

TKAVELllNAN  APtAZD-PEAlPOL  TN000KT9,^0  DA^OEl 
TBAVEL9IN00D  OP  AATZCLE 
T»AVEL9lDE9CKZ9E  NOOD  OP  AftTZCLE 
OASKETNAKEKimt  PEOPLE  Z  OECONE 

OASKETNAKEftlAOLE  PZKD  lENAZ^l  PEOPLE  ZZ-Olt  CAVE9 
OAlKETNAKEKlTftOE-PEOPLE  ZZZ  LZTE  ZK  LAftOEft  COHNON 
OAiKETNAKElS PEOPLE  ZZZ  09ED  PZTK009ES  POt  CEftEMOKT 
CLOSING:PUN-DOORMAN AT PLAZA HOTEL? NO
CLOSING:PUN-FOR MORE THAN 50 YEARS? NO


NENOKT I 
NENOKT I 
NENOKT I 
NENOKT  t 
NENOKT I 
NENOKT I 


T-24  1-19 




Block

Tape

Block

Tape

9-37 

1-12 

9-37 

1-12 

9-24 

9-24 

9-29 

$-29 

T-24 

1-19 

T-24 

1-19 

T-19 

1-20 

T-19 

1-20 

T-20 

1-21 

T-20 

1-21 

T-21 

1-22 

T-21 

1-22 

T-22 

1-23 

T-22 

1-23 

T-23 

1-24 

T-23 

1-24 

T-27 

4-24 

T-27 

T-24 

T-24 

T-25 

T-25 

T-34 

T-34 

T-37 

T-37 

T-34 

T-3  4 

T-24 

1-25 

T-24 

1  _  9  ft 

T-29 

1-24 

T-29 

1  — 2  • 

T-30 

1-27 

T-30 

1  —  2  7 

T-31 

1-24 

T-31 

1  — 2i 

T-32 

3-22 

T-3  2 

J  —  2  2 

T-33 

3-23 

T-3  3 

J  —  2  J 

T-34 

3-24 

T—  9  % 

J  —  •  ^ 

T-35 

3-25 

T— 3  5 

J  —  2  9 

0-19 

1-01 

0-1  f 

A  —  II  A 

O-20 

1-0  2 

u—  *  o 

A  —  U  • 

0-21 

1-03 

0— J  1 

1 — n  1 

A  — 1#  J 

0-22 

1-04 

0-2  2 

1  — W  ^ 

0-23 

1-05 

w—  2  J 

A  —  W  3 

0-24 

1-04 

0-24 

1  — Of 

0-25 

0—25 

0-24 

0-24 

0-27 

0-27 

0-24 

0-24 

0-29 

0-29 

O-30 

O-30 

0-31 

0-31 

V-29 

V-34 

V-30 

V-39 

V-31 

V-40 

11-34 

lf-40 

11-39 

lf-41 

W-40 

1f-42 

11-41 

lf-4  3 

X-17 

4-07 

X-17 

4-07 

X-14 

4-04 

X-14 

4-04 



Table A(1)
Reading Items and Locations


NAEP ID


DESCRIPTION


CLOSING:PUN-END SWINGING CAREER? YES
CLOSING:PUN-JOB HAS HELPED HIM? NO
CLOSING:PUN-UNLOCK SOME SECRETS? YES
CLOSING:PUN-A LOT HINGES ON KINDNESS? YES
CLOSING:MAIN PURPOSE-REPT SWEENEY LEAVES JOB
CLOSING:TONE OF CAPTION IS CLEVER AND WITTY
COW-TAIL: OGALOUSSA WAS KILLED WHILE HUNTING
COW-TAIL: THEME-PERSON NOT DEAD TILL FORGOTTEN
COW-TAIL: OGALOUSSA IS WISE & FAIR FATHER
COW-TAIL: OGALOUSSA SHAVED HEAD-RETURNED FROM DEAD
COW-TAIL: PULI GOT SWITCH-ASKED ABT FATHER MISSING
COW-TAIL: IS THIS A GOOD STORY?
COW-TAIL:WHY GOOD STORY


Grade 4/Age 9
Block  Tape


N008601  CRICKETS: MAKE SOUNDS BY RUBBING WINGS  H-06  1-15
N008602  CRICKETS: WHICH MAKE CHIRPING SOUNDS-ONLY MALES  H-07  1-16
N008603  CRICKETS: WHERE ARE EARS-IN FRONT LEGS  H-08  1-17

N008701  PICTURE:DOG LYING ON TOP DOGHOUSE-BEST DESCRIPTION  H-09
N008801  YVONNE'S DOLL:COULDN'T FIND-UNDER PORCH  J-18
N008901  DOG:WHY DOESNT WANT-THINKS DOGS ARE PESTS  J-21
N008902  DOG:CHILD BRINGING HOME SNAKE  J-22
N008903  DOG:IS THIS A GOOD POEM  J-23
N008904  DOG:WHY IS THIS A GOOD POEM  J-24
N009001  FOLKS:WHO ARE THEY-HUMANS WHO LIVE NEARBY  K-12
N009002  FOLKS:GRAY FOX THINK-FOLKS WERE SENSIBLE  K-13
N009003  FOLKS: MAN WAS SITTING ON BENCH IN GARDEN  K-14
N009004  FOLKS:DO WHEN FOX CAME NEAR-MAN WAS POLITE  K-15
N009101  NONSENSE WORD 3:HABBIES-DOGS  K-16  3-18
N009201  PUZZLE 1:BIRD DESCRIBED IN PUZZLE  K-17  3-28
N009401  DUAL:WORD BAT-2 MEANINGS FOOLED NELL  L-23
N009601  TIMOTHY 1:SITTING ON STEPS  L-21  1-08
N009701  BOXBALL: MASSACHUSETTS TEACHER INVENTED BASKETBALL  M-05
N009702  BOXBALL:PURPOSE OF ARTICLE-HOW BASKETBALL INVENTED  M-06
N009703  BOXBALL:TRUE-FOOTBALL INVENTED BEFORE BASKETBALL  M-07
N009704  BOXBALL:AT FIRST USED PEACH BASKET FOR GOALS  M-08
N009705  BOXBALL:BOTTOMS CUT OUT-TO MAKE IT EASIER  M-09
N009801  PUZZLE 3:CHAIR DESCRIBED IN PUZZLE  N-12
N009901  DESCRIPTION 3:PERSON HAS SEEN TOY MANY TIMES  N-13
N010001  DOG & SHADOW:LIKED READING IT  N-17  1-05
N010002  DOG & SHADOW: SAW HIMSELF IN THE STREAM  N-16  1-05
N010003  DOG & SHADOW: TEACHES LESSON-GREED DOESN'T PAY  N-19  1-07
N010101  SANDWICH:LIKED READING IT  N-20  1-09
N010102  SANDWICH:NAMED AFTER PERSON WHO INVENTED IT  N-21  1-10
N010103  SANDWICH:WANTED MEAT IN BREAD TO EAT AND GAMBLE  N-22  1-11
N010201  DESCRIPTION 1:CLOWN DESCRIBED IN PASSAGE  O-16  3-20
N010301  SNOWMAN:BEST DESCRIPTION-SOMEONE MADE SNOWMAN  O-15  2-09



Grade 8/Age 13


Grade 11/Age 17


Block

Tape

X-19 

4-09 

X-20 

4-10 

X-21 

4-11 

X-22 

4-12 

X-23 

4-13 

X-24 

4-14 

Y-04 

3-01 

Y-05 

3-02 

Y-06 

3-03 

Y-07 

3-04 

Y-08 

3-05 

Y-09 

3-06 

Y-10 

3-07 

Block  Tape

X-19  4-09 

X-20  4-10 

X-21  4-11 

X-22  4-12 

X-23  4-13 

X-24  4-14 

Y-06  3-01 

Y-07  3-02 

Y-08  3-03 

Y-09  3-04 

Y-10  3-05 

Y-11  3-06 

Y-12  3-07 




Table
Reading Items


N  NAEP ID  DESCRIPTION

226.  N010401  TOOTH TROUBLE: SPEAKER-CHILD
227.  N010402  TOOTH TROUBLE: TRUE-PULLED LOOSE TOOTH
228.  N010403  TOOTH TROUBLE: "NOT ME, WON'T PRDCE IT" SAME-NO GRO
229.  N010501  QUICKSAND: HOW TEST FOR IT-POKE WITH A STICK
230.  N010502  QUICKSAND:MAIN PURPOSE-TO TELL WAYS AVOID DANGER
231.  N010503  QUICKSAND: IT IS SOUPY SAND YOU CAN'T STAND ON
232.  N010504  QUICKSAND:IF STEP IN, LIE ON BACK & STRETCH OUT ARM
233.  N010601  THAD: CANDIDATES FOR PRES NOT ALLOWED GIVE GIFTS
234.  N010602  THAD:MAGGIE THOUGHT THAD GOOD BUT NEED HER HELP
235.  N010603  THAD:MASSIVE STAMPEDE-LOT OF PEOPLE RUSHING
236.  N010604  THAD: EXAGGERATED-CAN DO EVERYTHING IN YELLOW PAGES
237.  N010605  THAD: MAGGIE FIRST HELPED THAD WITH SPEECH
238.  N010701  SENTENCE 3:MOST SENSE-BALL ROLLED DOWN THE STREET
239.  N010801  ANGRY: CHILD COMES OUT WHEN FEELS BETTER
240.  N010901  STARS UNSEEN: LIKED READING IT
241.  N010902  STARS UNSEEN:STAR BECOMES DEAD BY USING UP FUEL
242.  N010903  STARS UNSEEN:MAIN IDEA-STARS EXIST-WE CAN'T SEE
243.  N010904  STARS UNSEEN:GRAVITY OF DEAD STARS-PUSH & PULL
244.  N011001  REPORTER: WHO WAS ERNIE PYLE-NEWSPAPER REPORTER
245.  N011002  REPORTER:HOW PYLES WRITING CHANGE-TROOP MVMNT&GNRL
246.  N011003  REPORTER:HAPPENED TO PYLE-FAMOUS REPORTER
247.  N011004  REPORTER:WHY PYLE CHANGE NEWS-REMEMBED DEATH SOLDI
248.  N011101  KIND OF BK: ATMOSPHERE FROM SCIENCE BOOK
249.  N011201  DOGS'QUAL:BITTEN BY DOG, DISAGREE
250.  N011301  SKUNK CABBAGE:NAME-SMELLS LIKE SKUNK, LOOKS CABBAGE
251.  N011302  SKUNK CABBAGE:HARD TO SEE-HIDDEN UNDER HOOD
252.  N011401  BREATHING: TRUE-BLOOD MOVES OXYGEN
253.  N011402  BREATHING: HOW AIR MOVES TO LUNGS-THROUGH WINDPIPE
254.  N011403  BREATHING: FUNCTION OF AIR SACS IN LUNGS-O2 FROM LU
255.  N011404  BREATHING:CO2 FORM IN BODY-CELLS FORM CO2 WASTE
256.  N011501  DICTIONARY:TO FIND WORD MEANING-DICTIONARY BEST
257.  N011601  DICTIONARY: DEFINITION TOME-A LARGE BOOK
258.  N011602  DICTIONARY:TOMORROW SYLLABICATED-TO MOR ROW
259.  N011603  DICTIONARY: PLURAL IS TONSILLECTOMIES
260.  N011604  DICTIONARY: TOLERANCE IS A NOUN
261.  N011605  DICTIONARY:TONIC-MAKES YOU FEEL BETTER
262.  N011701  WHICH WORD COMES FIRST IN DICTIONARY-FLEA
263.  N011801  ENCYCLOPEDIAS 2:WASHINGTON IN VOL 11
264.  N011901  INDEX:FIND OUT ABOUT SALMON-PGS 84 & 85
265.  N011902  INDEX: ALTERNATE INFO; RAILRDS-TRAVEL & TRANSPORT
266.  N011903  INDEX: FIND MAP OF SNAKE RIVER-PG 84
267.  N011904  INDEX:FIND MAP S. AMERICAN RAIN FORESTS-PG. 119
268.  N012001  DECLARATION OF INDEPENDENCE: BEST INFO ENCYCLOPEDI
269.  N012101  CODE:WHAT DOES HPPE ACTUALLY SPELL-GOOD
270.  N012201  DICTIONARY: PLUME IS FEATHER




A(1)
and Locations

Grade 4/Age 9  Grade 8/Age 13


Tape

0— 20 

U—  A  L 

0—  2  2 

p— 1  n 

Z—  U  J 

D.I  1 
If— 1  1 

z  — u  a 

If— 1  £ 

z— U  9 

If  —  A  J 

z— u  o 

F— 1  a 

If— 1  9 

If  —  A  O 

It  -  a  / 

P.I  a 

It  -  a  O 

P-19 

Q~16 

3-29 

0-17 

Q~l  8 

o— 19 

Q— 20 

R-05 

R— 06 

R— 07 

R— 08 

R-09 

1-36 

R-10 

2-18 

R-11 

3-21 

R-12 

3-2  2 

R-13 

R-14 

R-15 

R-16 

T-27 

S-21 

3-23 

S-22 

3-24 

S-23 

3-25 

S-24 

3-26 

S-25 

3-27 

S-30 

4-18 

S-31 

S-26 

1-24 

S-27 

1-25 

S-28 

1-26 

S-29 

1-27 

S-'2 

5-33 

T-19 

1-28 

Grade 11/Age 17
Block  Tape




Table
Reading Items


ID  DESCRIPTION 


DICTIONARY:MORE THAN 1 PLOWMAN IS PLOWMEN
DICTIONARY:PLUNDER-ROB
DICTIONARY:PLUM-IMPORTANT WORK
MUSHROOM: 3 PARTS--CAP, STEM, GILLS
INDEX: ALPHA LIST OF TOPICS AND PAGE NUMBERS
WHALE FOOD: INFO FOUND IN ENCYCLOPEDIA
ROTOR:BEST PLACE FIND INFO-DICTIONARY ROTOR
ENCYCLOPEDIA: INFO ON MEXICO IN VOLUME 6
ENCYCLOPEDIA:INFO ON INVENTIONS OF EDISON IN VOL 3
ENCYCLOPEDIA:INFO ON IOWA FARM PRODUCTS IN VOL. 5
ENCYCLOPEDIA:INFO ON N.Y.RIVERS & LAKES IN VOL. 7
GRAPH: SPENT MOST ON A BOOK
GRAPH: RECORD COST $2.50
GRAPH: 5 ITEMS COST MORE THAN PAINTBRUSH
GRAPH:SPENT SAME AMOUNT ON PAINTS, BIKE PARTS
TIMOTHY 3:TEENAGERS TALKING ABOUT HEAT
OIL SPILL: WHAT IS THE GEORGIA-A SHIP
OIL SPILL: WHERE WAS SPILL-5 MILES FROM BEACH
OIL SPILL:WHY LOGS NO STOP OIL-HIGH WAVES
OIL SPILL: WHAT ARE PEOPLE ASKED TO DO-CLEAN BEACH
THE COLD: BOY LEFT SHADOW-FROZE TO SIDE OF HOUSE
THE COLD:GIRLS FIGHT WITH MELTED WORDS
THE COLD: DUCKS FLY AWAY WITH POND
THE COLD: WRITER MAKES STORY SOUND PLAYFUL & FUNNY
BULLFIGHT:BULL CHARGES CAPE MOTION
DESCRIPTION 2:UNHAPPY PERSON DESCRIBED IN PASSAGE
FROM THE PLANET:BOTCHIK FELT ANNOYED AND UPSET
FROM THE PLANET: THOUGHT NO LIFE-THICK CLOUD COVER
FROM THE PLANET:IN GLASS CAGE WAS A HUMAN BEING
rSrSr'S^JS  ^®  "   REPAINTED        f  GONE 

«5J2«T!Ii,^,!II*'^^'^~"  SECRET  WAY  TO  MARK  BIKE 

fwT^-J!®^"^*-'"^'"  "  IA2Y  AND  RUDE 

SWINGING/STAR:PEOPLE SHOULD DIFFER--TRY BE BETTER
SWINGING/STAR: LINE 4 DOESN'T RHYME WITH OTHERS
OLD MAN: STORY TELLS HOW MAN LOOKS
SAVING ENERGY: MAIN IDEA-CONSERVE OIL & NAT GAS
SAVING ENERGY: SOURCE MOST ENERGY-OIL & NAT GAS
SAVING ENERGY: WHAT CAN SOLAR ENERGY PROVIDE-HEAT
NONSENSE WORD 3:TUP-PAPER
SENTENCE 1:MOST SENSE-BLEW HOUSE DOWN
TIMOTHY 2:TEENAGERS STANDING IN CIRCLES
WOMEN: BEST DESCRIBES WOMEN-WORKED HARD
FRONTIER WOMEN: ACTIVITIES PERFORMED-MAKE TOOLS&PL
FRONTIER WOMEN: MADE FROM ANML HORNS/BONES-TOOLS
CONNECT DOTS:ALONG LINE, CONNECT DOTS


A(1)
and Locations

Grade 4/Age 9  Grade 8/Age 13


Block

Tape

T— 20 

1  — •  3f 

f  — «  1 

1     9  n 

1-30 

T— «  « 

1-3 1 

S— 20 

1-12 

T— 

T  — «  5 

*  — •  4 

1  —  13 

f  —  «  O 

1      9  9 

1-3  2 

*— •  y 

1      9  9 

1-33 

*  —  JU 

1      9  il 

1  —  4  4 

T— <  1 
A  —4  1 

1    9  e 
1  —  4  5 

T— ^9 
*    *  • 

1  _9n 
1  — •  U 

1  —  *  * 

1  91 
1  — •  1 

1^9  9 
1  — 2« 

T-3  5 

1  — •  4 

U-19 

U— 20 

U-21 

U-22 

U-23 

U-24 

1-01 

U-2S 

1-0  9 
A  — u  « 

U-26 

1-03 

U-27 

1-04 

V-29 

4-17 

V-30 

1  —  14 

V-31 

V-33 

W-37 

4—07 

W-38 

4-0  8 

W-40 

W-41 

w-4a 

X-17 

X-18 

X-19 

X-20 

M-\3 

2-14 

Q-21 

V-34 

N-14 

N-IS 

N-16 

V-35 

Grade 11/Age 17
Block  Tape




Table A(1)
Reading Items and Locations

N  NAEP ID  DESCRIPTION  Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17


Block  Tape  Block  Tape  Block  Tape

N014502  CONNECT DOTS:DRAW LINE TO TOUCH CIRCLES  V-35
N014503  CONNECT DOTS:WRITE 3 IN EACH CIRCLE  V-35
N015101  BLACK ELK: THINK WASICHUS WERE GREEDY
N015102  BLACK ELK: WHO WERE THE WASICHUS  R-18
N015103  BLACK ELK:DRINKS WATERS DREAM PREDICT  R-19
N015104  BLACK ELK: MAIN PURPOSE OF STORY
N015201  PEOPLE LEARN TO READ: IN SCHOOL  ^'^^
N015501  CHAMONIX: LIKED READING IT
N015502  CHAMONIX: WHY SO LONG TO REACH-WINDS TOO STRONG  P-16
N015503  CHAMONIX: DEVOUASSOU-MAN WHO FOUND CLIMBERS  P-17
N015504  CHAMONIX: DESMAISON SURVIVE BY MENTAL/PHYS STRNGTH  P-18
N015505  CHAMONIX: WHY DESMAISON CRY-OVERCOME SUFFERING & JOY  P-19
N015901  HIGH TECH PIZZA: WHY PIZZA  Q-^^
N015902  HIGH TECH PIZZA: INTERMEDIATE STAGE  Q-15
N015903  HIGH TECH PIZZA: CORNSTARCH USED
N015904  HIGH TECH PIZZA: DESCRIBE FABRICATION OF PIZZA  Q-17
N016001  VOTING: MAIN PURPOSE  O-^S
N016002  VOTING: MEANING OF SUFFRAGE  ^-^^
N016003  VOTING: FIRST CONGRESSWOMAN  O-17
N016004  VOTING: DISASTER AT TRIANGLE SHIRTWAIST  O-18
N016005  VOTING: ROSE SCHNEIDERMAN SAY
N016006  VOTING: WW I HELPED SUFFRAGIST CAUSE  O-20
N017001  THE CHIP: MAIN IDEA
N017002  THE CHIP: WIDESPREAD RESULT
N017003  THE CHIP: MEANING OF TRIPLING




Table  A(2) 
Background  and  Attitude  Items  by  Topic 


General  Background 


Demographic  background  and  home  environment 


Ethnicity 

B000101  ETHNICITY
B000102  OTHER ETHNICITY
B000201  ARE YOU HISPANIC
B000202  OTHER SPANISH-HISPANIC


Language  background 

B000301  WHAT  LANGUAGE  DO  YOU  SPEAK  MOST  OFTEN  IN  HOME

B000302  OTHER  LANGUAGE  YOU  SPEAK  MOST  OFTEN  IN  HOME 

B000401  WHAT  LANGUAGE  DO  OTHERS  SPEAK  MOST  OFTEN  IN  HOME 

B000402  OTHER  LANGUAGE  OTHERS  SPEAK  MOST  OFTEN  IN  HOME 

B000501  FIRST  OTHER  LANGUAGE  YOU  KNOW 

B000502  SECOND  OTHER  LANGUAGE  YOU  KNOW 

B000503  THIRD  OTHER  LANGUAGE  YOU  KNOW 

Mother  work  outside  home 

B000801    DOES  YOUR  MOTHER  WORK  OUTSIDE  YOUR  HOME 

Parents'  education 

B000601  HOW  FAR  IN  SCHOOL  DID  YOUR  FATHER  GO 
B000701    HOW  FAR  IN  SCHOOL  DID  YOUR  MOTHER  GO 


Objects in home

B000901  DOES YOUR FAMILY GET A NEWSPAPER REGULARLY
B000902  IS THERE A DICTIONARY IN YOUR HOME
B000903  IS THERE AN ENCYCLOPEDIA IN YOUR HOME
B000904  ARE THERE MORE THAN 25 BOOKS IN YOUR HOME
B000905  DOES YOUR FAMILY GET MAGAZINES REGULARLY
B000906  IS THERE A VIDEO GAME IN YOUR HOME
B000907  IS THERE A COMPUTER IN YOUR HOME

Mobility

B002001  HOW MANY DIFFERENT TOWNS HAVE YOU LIVED IN
S002801  WHERE DID YOU LIVE AT AGE 9
S002802  STATE
S002803  COUNTRY
S005901  WHERE DID YOU LIVE AT AGE 13




Who  is  home  after  school 

S003201  WHAT DO YOU USUALLY DO AFTER SCHOOL
S003202  IF YOU GO HOME AFTER SCHOOL, WHO IS USUALLY THERE
S003203  WHAT OTHER ADULT

Family composition

S003901  HOW MANY OLDER BROTHERS AND SISTERS
S003902  HOW MANY YOUNGER BROTHERS AND SISTERS


Educational  background  and  plans 

School  program 

B001001  DO YOU HAVE GYM ONCE PER WEEK
B001002  DO YOU HAVE ART ONCE PER WEEK
B001003  DO YOU HAVE MUSIC ONCE PER WEEK
B001004  DO YOU HAVE FOREIGN LANGUAGE ONCE PER WEEK
B001005  DO YOU HAVE COMPUTER CLASS ONCE PER WEEK
B001006  DO YOU HAVE DRAMA CLASS ONCE PER WEEK
B001007  DO YOU HAVE SCIENCE ONCE PER WEEK

Grades

B001901  WHICH DESCRIBES YOUR GRADES IN SCHOOL

Preschool experience

S002701  DID YOU GO TO KINDERGARTEN
S002702  DID YOU GO TO DAY CARE
S002703  DID YOU GO TO NURSERY SCHOOL
S002704  DID YOU GO TO HEADSTART

Educational expectations

S003401  DO YOU EXPECT TO GRADUATE FROM HIGH SCHOOL

Applied to college

S005701  HAVE YOU APPLIED TO COLLEGE

Career goals

S005801  WHAT ARE LONG-TERM CAREER GOALS


High school program

Science courses taken

S006001  HAVE YOU TAKEN GENERAL SCIENCE
S006002  HAVE YOU TAKEN BIOLOGY
S006003  HAVE YOU TAKEN CHEMISTRY
S006004  HAVE YOU TAKEN PHYSICS




S006005    WHAT  OTHER  SCIENCE  COURSES 


Math courses taken

S006101  HAVE YOU TAKEN GENERAL MATH 1
S006102  HAVE YOU TAKEN GENERAL MATH 2
S006103  HAVE YOU TAKEN FIRST-YEAR ALGEBRA
S006104  HAVE YOU TAKEN SECOND-YEAR ALGEBRA
S006105  HAVE YOU TAKEN GEOMETRY
S006106  HAVE YOU TAKEN CALCULUS
S006107  WHAT OTHER MATH COURSES-1
S006108  WHAT OTHER MATH COURSES-2
S006109  WHAT OTHER MATH COURSES-3


Special  courses  taken 

S006401  EVER HAD REMEDIAL ENGLISH
S006402  EVER HAD REMEDIAL MATHEMATICS
S006403  EVER HAD HONORS ENGLISH
S006404  EVER HAD HONORS MATH
S006405  EVER HAD HONORS SCIENCE
S006406  EVER HAD BILINGUAL PROGRAM
S006407  EVER HAD FAMILY-LIFE OR SEX EDUCATION
S006408  EVER HAD ALCOHOL OR DRUG-ABUSE EDUCATION
S006409  EVER HAD PROGRAM BECAUSE OF PHYSICAL PROBLEM
S006410  EVER HAD PROGRAM BECAUSE OF SPEECH PROBLEM
S006501  HAVE YOU TAKEN AGRICULTURE
S006502  HAVE YOU TAKEN AUTO MECHANICS
S006503  HAVE YOU TAKEN COMMERCIAL ARTS
S006504  HAVE YOU TAKEN COMPUTER PROGRAMMING
S006505  HAVE YOU TAKEN CARPENTRY
S006506  HAVE YOU TAKEN ELECTRICAL CONSTRUCTION
S006507  HAVE YOU TAKEN MASONRY
S006508  HAVE YOU TAKEN PLUMBING
S006509  HAVE YOU TAKEN COSMETOLOGY
S006510  HAVE YOU TAKEN DRAFTING
S006511  HAVE YOU TAKEN ELECTRONICS
S006512  HAVE YOU TAKEN HOME ECONOMICS
S006513  HAVE YOU TAKEN MACHINE SHOP
S006514  HAVE YOU TAKEN MEDICAL OR DENTAL ASSIST
S006515  HAVE YOU TAKEN PRACTICAL NURSE
S006516  HAVE YOU TAKEN FOOD SERVICE
S006517  HAVE YOU TAKEN SALES OR MERCHANDISING
S006518  HAVE YOU TAKEN SECRETARIAL
S006519  HAVE YOU TAKEN WELDING
S006520  WHAT OTHER COURSES HAVE YOU TAKEN




Plans  after  high  school 

S006601    WHAT  ONE  THING  WILL  YOU  DO  AFTER  HIGH  SCHOOL 
S006701    WHAT  OTHER  THINGS  WILL  YOU  DO  AFTER  HIGH  SCHOOL 


Computer exposure and use

Computer exposure and use

S002901  DO YOU USE A COMPUTER AT HOME
S002902  DO YOU USE A COMPUTER AT THE LIBRARY
S002903  DO YOU USE A COMPUTER AT A FRIENDS HOUSE
S002904  HOW OFTEN DO YOU USE A COMPUTER AT SCHOOL
S003001  DO YOU USE A COMPUTER TO PLAY GAMES
S003002  DO YOU USE A COMPUTER TO LEARN THINGS
S003003  DO YOU USE A COMPUTER TO WRITE STORIES OR PAPERS
S003101  HOW OFTEN DO YOU WRITE COMPUTER PROGRAMS


Use  of  time

Time  spent  on  homework 

B001701    HOW  MUCH  TIME  DID  YOU  SPEND  ON  HOMEWORK  YESTERDAY 

TV  watching 

B001801    HOW  MUCH  TELEVISION  DO  YOU  WATCH  EACH  DAY 

How  much  free  time 

S006801    HOW  MUCH  FREE  TIME  ON  AVERAGE  SCHOOL  DAY 

Use  of  free  time 


S005001  WHEN FREE TIME, HOW OFTEN WATCH TV
S005002  WHEN FREE TIME, HOW OFTEN READ A BOOK
S005003  WHEN FREE TIME, HOW OFTEN WRITE IN DIARY
S005004  WHEN FREE TIME, HOW OFTEN CALL A FRIEND
S005005  WHEN FREE TIME, HOW OFTEN BE WITH FRIENDS
S005006  WHEN FREE TIME, HOW OFTEN GO SHOPPING
S005007  WHEN FREE TIME, HOW OFTEN PLAY A SPORT
S005008  WHEN FREE TIME, HOW OFTEN GO HUNTING OR FISHING
S005009  WHEN FREE TIME, HOW OFTEN TAKE A WALK
S005010  WHEN FREE TIME, HOW OFTEN WORK AT A COMPUTER
S005011  WHEN FREE TIME, HOW OFTEN PLAY VIDEO GAMES
S005012  WHEN FREE TIME, HOW OFTEN READ A NEWSPAPER
S005013  WHEN FREE TIME, HOW OFTEN GET A SNACK
S005014  WHEN FREE TIME, HOW OFTEN DO EXTRA HOMEWORK
S005015  WHEN FREE TIME, HOW OFTEN WRITE A LETTER
S005016  WHEN FREE TIME, HOW OFTEN LISTEN TO MUSIC



S005017  WHEN FREE TIME, HOW OFTEN DO SOMETHING ELSE
S005018  WHEN FREE TIME, WHAT IS IT
S005019  WHEN FREE TIME WHAT ACTIVITY SPEND MOST TIME


Activities

S003601  HOW OFTEN DO YOU GO TO A MOVIE
S003602  HOW OFTEN DO YOU GO TO A PLAY
S003603  HOW OFTEN DO YOU GO TO A CONCERT
S003604  HOW OFTEN DO YOU GO TO A PARTY
S003605  HOW OFTEN DO YOU GO TO PUBLIC LIBRARY
S003606  HOW OFTEN DO YOU TRAVEL TO A PLACE AWAY FROM HOME
S003607  HOW OFTEN DO YOU GO SHOPPING
S003608  HOW OFTEN DO YOU GO TO A SPORTS EVENT
S003609  HOW OFTEN DO YOU PLAY CARD OR TABLE GAMES
S003610  HOW OFTEN DO YOU VISIT RELATIVES
S003611  HOW OFTEN DO YOU GO TO A MUSEUM
S003612  HOW OFTEN DO YOU GO CAMPING
S003613  HOW OFTEN DO YOU STAY HOME ALONE
S003614  WHAT ACTIVITY DO YOU DO MOST OFTEN


Orientation  to  school 


Bored 


S003701    DO  YOU  EVER  FEEL  BORED  AT  SCHOOL 


Sanctions

S003801  DURING PAST YEAR HOW OFTEN SENT TO PRINCIPALS OFF
S003802  DURING PAST YEAR HOW OFTEN PLACED ON PROBATION
S003803  DURING PAST YEAR HOW OFTEN GIVEN A DETENTION
S003804  DURING PAST YEAR HOW OFTEN WARNED ABOUT ATTENDANC
S003805  DURING PAST YEAR HOW OFTEN WARNED ABOUT GRADES
S003806  DURING PAST YEAR HOW OFTEN WARNED ABOUT BEHAVIOR

Absenteeism

S004001  HOW MANY DAYS OF SCHOOL MISSED LAST MONTH

Lateness

S004101  HOW MANY TIMES LATE FOR SCHOOL LAST MONTH


Ratings  of  school 

S006201  RATE SCHOOL: PREPARING FOR COLLEGE
S006202  RATE SCHOOL: PREPARING FOR CAREER
S006203  RATE SCHOOL: PREPARING FOR LIFE
S006204  RATE SCHOOL: EXTRACURRICULAR ACTIVITIES-VARIETY
S006205  RATE SCHOOL: EXTRACURRICULAR ACTIVITIES-QUALITY




S006206  RATE SCHOOL: FACULTY INTEREST
S006207  RATE SCHOOL: QUALITY OF FACULTY
S006208  RATE SCHOOL: QUALITY OF STUDENT LIFE

Ratings  of  own  school  experience 

S006301  TRUE OR FALSE: SATISFIED WITH PROGRESS OF EDUCAT
S006302  TRUE OR FALSE: NOT LEARNING WHAT NEED TO KNOW
S006303  TRUE OR FALSE: HAD DISCIPLINARY PROBLEMS IN PAST
S006304  TRUE OR FALSE: AM INTERESTED IN SCHOOL
S006305  TRUE OR FALSE: EVERY ONCE IN A WHILE CUT A CLASS
S006306  TRUE OR FALSE: DO NOT FEEL SAFE AT THIS SCHOOL
S006307  TRUE OR FALSE: WISH COULD GO TO DIFFERENT SCHOOL


Reading  and  Writing  Background


Student  perceptions  of  instructional  practices  in  reading  and  writing 

School writing assignments

B002401  REPORTS AND ESSAYS WRITTEN FOR SCHOOL LAST 6 WEEK
S000101  TIME SPENT IN ENGLISH CLASS LEARNING TO WRITE
S000201  REPORTS AND PAPERS WRITTEN FOR SCHOOL LAST 6 WEEK
S000301  WRITINGS DONE LAST WEEK FOR SOCIAL STUDIES CLASS
S000401  WRITINGS DONE LAST WEEK FOR SCIENCE

How teacher assists in writing

S000601  WHEN WRITING HOW OFTEN TEACHER ASKS TO MAKE NOTES
S000602  WHEN WRITING HOW OFTEN TEACHER ASKS MAKE OUTLINE
S000603  WHEN WRITING HOW OFTEN TEACHER ASKS NOTE CHANGES
S000604  WHEN WRITING HOW OFTEN TEACHER ASKS TALK DURING
S000605  WHEN WRITING HOW OFTEN TEACHER ASKS TALK MATES
S000606  WHEN WRITING HOW OFTEN TEACHER ASK REDO BEFOR GRD
S000607  WHEN WRITING HOW OFTEN TEACHER ASK REDO AFTER GRD

Teacher feedback after writing

B002604  HOW OFTEN DOES TEACHER WRITE SUGGESTIONS ON PAPER
B002605  HOW OFTEN DOES TEACHER DISCUSS FINISHED PAPERS
S001701  HOW OFTEN DOES TEACHER ASK IF YOU FOLLOWED DIRECT
S001702  HOW OFTEN DOES TEACHER ASK IF YOU WROTE ENOUGH
S001703  HOW OFTEN DOES TEACHER ASK YOUR IDEAS IN PAPER
S001704  HOW OFTEN DOES TEACHER ASK EXPLANATIONS IN PAPER
S001705  HOW OFTEN DOES TEACHER ASK EXPRESS FEELINGS PAPER
S001706  HOW OFTEN DOES TEACHER ASK ORGANIZATION IN PAPER
S001707  HOW OFTEN DOES TEACHER ASK WORDS YOU USED IN PAPE




S001708  HOW OFTEN DOES TEACHER ASK SPELLING, GRAM IN PAPE
S001709  HOW OFTEN DOES TEACHER ASK YOUR NEATNESS IN PAPER
S002501  HOW OFTEN DOES TEACHER MARK ERRORS ON PAPERS
S002502  HOW OFTEN DOES TEACHER WRITE NOTES ON PAPERS
S002503  HOW OFTEN DOES TEACHER POINT OUT GOOD THINGS
S002504  HOW OFTEN DOES TEACHER POINT OUT NOT GOOD THINGS
S002505  HOW OFTEN DOES TEACHER MAKE SUGGESTIONS FOR NEXT
S002506  HOW OFTEN DOES TEACHER SHOW INTEREST IN WRITING

Teacher behavior around reading

S004601  HOW OFTEN WITH NEW READING TEACHER POINT HARD WOR
S004602  HOW OFTEN WITH NEW READING TEACHER PREVIEW READIN
S004603  HOW OFTEN WITH NEW READING TEACHER READ PART ALOU
S004701  HOW OFTEN DOES TEACHER LIST OF QUESTS AS YOU READ
S004702  HOW OFTEN DOES TEACHER TELL HOW TO FIND MAIN IDEA
S004703  HOW OFTEN DOES TEACHER TELL HOW TO READ FASTER

Teacher  behavior  around  writing 

B002601  HOW  OFTEN  ENCOURAGED  MAKE  NOTES  ON  TOPIC  OF  PAPER 

B002602  HOW  OFTEN  ENCOURAGED  TO  MAKE  OUTLINES  OF  PAPER 

Time  spent  learning  to  write 

B002501  PART  OF  CLASS  TIME  SPENT  LEARNING  TO  WRITE  REPORT

B002603  HOW  OFTEN  DO  YOU  WRITE  PAPER  MORE  THAN  ONCE 

B002606  HOW  OFTEN  DO  YOU  IMPROVE  PAPER  AFTER  RETURN 


Self-assessment  as  reader  and  writer 

Self-assessment  as  reader 

S003301    WHAT  KIND  OF  READER  ARE  YOU 

Self-assessment  as  writer 

B002607  DO  YOU  ENJOY  WORKING  ON  WRITING  ASSIGNMENTS 

S001201  HOW OFTEN IS TRUE: LIKE TO WRITE
S001202  HOW OFTEN IS TRUE: AM A GOOD WRITER
S001203  HOW OFTEN IS TRUE: THINK WRITING IS WASTE OF TIME
S001204  HOW OFTEN IS TRUE: PEOPLE LIKE WHAT I WRITE
S001205  HOW OFTEN IS TRUE: WRITE ON OWN AWAY FROM SCHOOL
S001206  HOW OFTEN IS TRUE: DISLIKE WRITING TO BE GRADED
S001207  HOW OFTEN IS TRUE: WOULDNT WRITE IF NOT FOR SCHOO




Student  study  habits  and  reading  and  writing  behavior 

Pages  read  for  school  and  homework 

B001101    HOW  MANY  PAGES  READ  IN  SCHOOL  AND  FOR  HOMEWORK


Frequency of kinds of writing

B001201  STORIES WRITTEN FOR ENGLISH LAST WEEK
B001202  ESSAYS WRITTEN FOR ENGLISH CLASS LAST WEEK
B001203  POEMS WRITTEN FOR ENGLISH CLASS LAST WEEK
B001204  PLAYS WRITTEN FOR ENGLISH CLASS LAST WEEK
B001205  LETTERS WRITTEN FOR ENGLISH CLASS LAST WEEK
B001206  BOOK REPORTS WRITTEN FOR ENGLISH CLASS LAST WEEK
B001207  OTHER REPORTS WRITTEN FOR ENGLISH CLASS LAST WEEK
B001208  I DO NOT HAVE AN ENGLISH CLASS
S000501  WRITINGS DONE LAST WEEK NON-SCHOOL RELATED
S001901  HOW OFTEN DO YOU WRITE A BOOK REPORT
S001902  HOW OFTEN DO YOU WRITE ABOUT SCIENCE EXPERIMENT
S001903  HOW OFTEN DO YOU WRITE LETTER TO A RELATIVE
S001904  HOW OFTEN DO YOU WRITE NOTES OR MESSAGE
S001905  HOW OFTEN DO YOU WRITE STORY THAT NOT HOMEWORK


Last  thing  read  on  own 

B001401    WHAT  WAS  THE  LAST  THING  YOU  READ  FOR  SCHOOL 
B001501    WHAT  WAS  THE  LAST  THING  YOU  READ  ON  YOUR  OWN 
B001502    OTHER  THING  YOU  READ  ON  YOUR  OWN 


Frequency  of  kinds  of  reading  behavior 


S004301  HOW OFTEN DO YOU READ A STORY OR NOVEL
S004302  HOW OFTEN DO YOU READ A POEM
S004303  HOW OFTEN DO YOU READ A PLAY
S004304  HOW OFTEN DO YOU READ A NEWSPAPER
S004305  HOW OFTEN DO YOU READ A MAGAZINE
S004306  HOW OFTEN DO YOU READ A SCIENCE BOOK
S004307  HOW OFTEN DO YOU READ A BIOGRAPHY
S004308  HOW OFTEN DO YOU READ A HOW-TO-DO BOOK
S004309  HOW OFTEN DO YOU READ A BOOK ABOUT OTHER TIMES
S004310  HOW OFTEN DO YOU READ A SPORTS BOOK
S004311  HOW OFTEN DO YOU READ WORDS OF SONG

Behavior  around  writing 

S000901  WHEN WRITING HOW OFTEN ASK SELF SUBJECT PAPER
S000902  WHEN WRITING HOW OFTEN LOOK UP FACTS IN BOOKS
S000903  WHEN WRITING HOW OFTEN THINK BEFORE WRITING
S000904  WHEN WRITING HOW OFTEN THINK ABOUT LAYOUT
S000905  WHEN WRITING HOW OFTEN USE DIFF STYLES PER PERSON
S000906  WHEN WRITING HOW OFTEN MAKE CHANGES AS YOU WRITE




S000907  WHEN WRITING HOW OFTEN MAKE CHANGE AFTER WRITING
S001001  HOW OFTEN HAVE YOU SHOWN FRIENDS YOUR WRITINGS
S001002  HOW OFTEN HAVE PAPERS BEEN PRINTED IN SCHOOL PAPE
S001003  HOW OFTEN DOES YOUR FAMILY READ YOUR PAPERS
S001301  HOW OFTEN IS TRUE: MOVE SENTENCES AROUND
S001302  HOW OFTEN IS TRUE: ADD NEW IDEAS OR INFORMATION
S001303  HOW OFTEN IS TRUE: TAKE OUT UNDESIRED PARTS
S001304  HOW OFTEN IS TRUE: CHANGE WORDS
S001305  HOW OFTEN IS TRUE: CORRECT SPELLING MISTAKES
S001306  HOW OFTEN IS TRUE: CORRECT GRAMMAR MISTAKES
S001307  HOW OFTEN IS TRUE: CORRECT PUNCTUATION MISTAKES
S001308  HOW OFTEN IS TRUE: REWRITE MOST OF PAPER
S001309  HOW OFTEN IS TRUE: THROW OUT AND START OVER
S001601  HOW OFTEN DO YOU LIST THINGS TO BUY
S001602  HOW OFTEN DO YOU COPY RECIPE OR DIRECTIONS
S001603  HOW OFTEN DO YOU FILL OUT ORDER BLANKS
S001604  HOW OFTEN DO YOU KEEP A DIARY OR JOURNAL
S001605  HOW OFTEN DO YOU DO A CROSSWORD PUZZLE
S001606  HOW OFTEN DO YOU HELP OTHER STUDENTS WITH WRITING
S001607  HOW OFTEN DO YOU WRITE ABOUT WHAT YOU HAVE READ
S001608  HOW OFTEN DO YOU WRITE PAPERS TOO PERSONAL TO SHO
S001609  HOW OFTEN DO YOU WRITE FOR SCHOOL NEWSPAPER


Behavior  around  reading 

S003501  HOW OFTEN DO YOU READ FOR FUN ON YOUR OWN TIME
S003502  HOW OFTEN DO YOU TELL FRIEND ABOUT A GOOD BOOK
S003503  HOW OFTEN DO YOU TAKE BOOKS OUT OF THE LIBRARY
S003504  HOW OFTEN DO YOU SPEND YOUR OWN MONEY ON BOOKS
S003505  HOW OFTEN DO YOU READ BOOK BASED ON MOVIE YOU SAW
S003506  HOW OFTEN DO YOU READ BOOKS BY AN AUTHOR YOU LIKE
S004401  HOW OFTEN DOES SOMEONE READ ALOUD TO YOU
S004402  HOW OFTEN DO YOU READ ALOUD TO SOMEONE
S005201  HOW OFTEN DO YOU READ ALOUD IN SCHOOL
S005202  HOW OFTEN DO YOU READ ON OWN IN SCHOOL
S005203  HOW OFTEN DO YOU WORK IN A WORKBOOK


Studying for tests

S005101  HOW OFTEN WHEN STUDY FOR TEST: READ OVER MATERIAL
S005102  HOW OFTEN WHEN STUDY FOR TEST: TAKE NOTES ON READ
S005103  HOW OFTEN WHEN STUDY FOR TEST: MAKE OUTLINES
S005104  HOW OFTEN WHEN STUDY FOR TEST: QUES IN TEXTBOOK
S005105  HOW OFTEN WHEN STUDY FOR TEST: ANSWER OWN QUESTNS
S005106  HOW OFTEN WHEN STUDY FOR TEST: QUESTION OTHERS


Use  of  library 

S005301    HOW  OFTEN  GO  TO  LIBRARY  TO  READ  ON  OWN 




Table  A(2) 
(continued) 


S005302  HOW OFTEN GO TO LIBRARY TO LOOK UP FACT FOR SCHOO
S005303  HOW OFTEN GO TO LIBRARY TO FIND BOOKS FOR HOBBIES
S005304  HOW OFTEN GO TO LIBRARY FOR QUIET PLACE TO READ
S005305  HOW OFTEN GO TO LIBRARY TO TAKE OUT BOOKS


Behavior  around  writing  in  school 

S002001  WHAT WAS THE LAST THING YOU WROTE IN SCHOOL
S002002  LAST WRITING IN SCHOOL: COPY OVER BEFORE SUBMITIN
S002003  LAST WRITING IN SCHOOL: MAKE CHANGES BEFORE SUBMI
S002004  LAST WRITING IN SCHOOL: MAKE CHANGES AFTER RETURN
S002005  LAST WRITING IN SCHOOL: LIKE DOING THE WRITING


Student  orientation  toward  usefulness  of  reading  and  writing 

Student  orientation  toward  usefulness  of  writing 

S000701  HOW OFTEN IS TRUE: WRITING IS IMPORTANT
S000702  HOW OFTEN IS TRUE: WRITING HELPS LEARN MYSELF
S000703  HOW OFTEN IS TRUE: WRITING REMINDS ABOUT THINGS
S000704  HOW OFTEN IS TRUE: WRITING HELPS ME STUDY
S000705  HOW OFTEN IS TRUE: WRITING HELPS NEW IDEAS
S001401  HOW OFTEN IS TRUE: GOOD WRITING GETS A BETTER JOB
S001402  HOW OFTEN TRUE: GOOD WRITING INFLUENTIAL
S001501  HOW OFTEN TRUE: WRITING HELPS THINK MORE CLEARLY
S001502  HOW OFTEN TRUE: WRITING HELPS TELL OTHERS THINKIN
S001503  HOW OFTEN TRUE: WRITING HELPS TELL OTHERS FEELING
S001504  HOW OFTEN TRUE: WRITING HELPS UNDERSTAND MYSELF


Student orientation toward usefulness of reading

S004201  HOW OFTEN READING: HELPS ME DECIDE WANT TO BE
S004202  HOW OFTEN READING: HELP ME LEARN TO FIX THINGS
S004203  HOW OFTEN READING: HELPS UNDERSTAND PEOPLES ACTIO
S004204  HOW OFTEN READING: READING IS IMPORTANT
S004205  HOW OFTEN READING: BETTER FEWER HARD WORDS
S004206  HOW OFTEN READING: BETTER FEWER LONG SENTENCES
S004207  HOW OFTEN READING: BETTER IF IT MATTERED TO ME
S004208  HOW OFTEN READING: BETTER IF TEACH GAVE MORE TIME
S004209  HOW OFTEN READING: BETTER IF DIDNT HAVE SO MUCH
S004210  HOW OFTEN READING: BETTER IF WASNT TESTED ON IT
S004211  HOW OFTEN READING: LIKE MORE IF COULD TALK W OTHE
S004801  HOW OFTEN TRUE: WRITING HELPS ME GET A GOOD JOB
S004802  HOW OFTEN TRUE: WRITING HELPS ME SHARE MY IDEAS
S004803  HOW OFTEN TRUE: WRITING HELPS SHOW I KNOW THINGS
S004804  HOW OFTEN TRUE: WRITING HELPS KEEP IN TOUCH FRIEN




Table  A(2) 
(continued) 


Student's experiential base for writing

Student's experiential base for writing

S005401  HOW OFTEN DO YOU WATCH NEWS ON TELEVISION
S005402  HOW OFTEN DO YOU READ A NEWS MAGAZINE
S005403  HOW OFTEN DO YOU READ NEWSPAPER NOT COMICS OR SPR
S005404  HOW OFTEN DO YOU LISTEN TO NEWS ON RADIO


Reading and writing behavior of people in student's home

Reading behavior of people in student's home

S004501  HOW OFTEN DOES FAMILY READ NEWSPAPERS
S004502  HOW OFTEN DOES FAMILY READ MAGAZINES
S004503  HOW OFTEN DOES FAMILY READ BOOKS
S004504  HOW OFTEN DOES FAMILY READ RECIPES

Writing behavior of people in student's home

S001101  HOW OFTEN DOES FAMILY LIST THINGS TO DO
S001102  HOW OFTEN DOES FAMILY COPY RECIPES OR DIRECTIONS
S001103  HOW OFTEN DOES FAMILY FILL OUT ORDER BLANKS
S001104  HOW OFTEN DOES FAMILY WRITE CHECKS
S001105  HOW OFTEN DOES FAMILY KEEP DIARIES
S001106  HOW OFTEN DOES FAMILY WORK CROSSWORD PUZZLE
S001801  HOW OFTEN DOES FAMILY WRITE LETTER TO A RELATIVE
S001802  HOW OFTEN DOES FAMILY WRITE NOTES OR MESSAGES
S001803  HOW OFTEN DOES FAMILY WRITE STORY OR POEM
S001804  HOW OFTEN DOES FAMILY WRITE BUSINESS LETTER



Table A(3)

Background and Attitude Items and Locations (Spiral)

   # NAEP ID  DESCRIPTION                                         Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17

  1. B000101  ETHNICITY                                           CB-01          CB-01           CB-01
  2. B000201  ARE YOU HISPANIC                                    CB-02          CB-02           CB-02
  3. B000301  WHAT LANGUAGE DO YOU SPEAK MOST OFTEN IN HOME       CB-03          CB-03           CB-03
  4. B000302  OTHER LANGUAGE YOU SPEAK MOST OFTEN IN HOME         CB-03          CB-03           CB-03
  5. B000401  WHAT LANGUAGE DO OTHERS SPEAK MOST OFTEN IN HOME    CB-04          CB-04           CB-04
  6. B000402  OTHER LANGUAGE OTHERS SPEAK MOST OFTEN IN HOME      CB-04          CB-04           CB-04
  7. B000501  FIRST OTHER LANGUAGE YOU KNOW                       CB-05          CB-05           CB-05
  8. B000502  SECOND OTHER LANGUAGE YOU KNOW                      CB-05          CB-05           CB-05
  9. B000503  THIRD OTHER LANGUAGE YOU KNOW                       CB-05          CB-05           CB-05
 10. B000601  HOW FAR IN SCHOOL DID YOUR FATHER GO                CB-08          CB-08           CB-08
 11. B000701  HOW FAR IN SCHOOL DID YOUR MOTHER GO                CB-07          CB-07           CB-07
 12. B000801  DOES YOUR MOTHER WORK OUTSIDE YOUR HOME             CB-06          CB-06           CB-06
 13. B000901  DOES YOUR FAMILY GET A NEWSPAPER REGULARLY          CB-09          CB-09           CB-09
 14. B000902  IS THERE A DICTIONARY IN YOUR HOME                  CB-10          CB-10           CB-10
 15. B000903  IS THERE AN ENCYCLOPEDIA IN YOUR HOME               CB-11          CB-11           CB-11
 16. B000904  ARE THERE MORE THAN 25 BOOKS IN YOUR HOME           CB-12          CB-12           CB-12
 17. B000905  DOES YOUR FAMILY GET MAGAZINES REGULARLY            CB-13          CB-13           CB-13
 18. B000906  IS THERE A VIDEO GAME IN YOUR HOME                  CB-14          CB-14           CB-14
 19. B000907  IS THERE A COMPUTER IN YOUR HOME                    CB-15          CB-15           CB-15
 20. B001001  DO YOU HAVE GYM ONCE PER WEEK                       CB-17          CB-17           CB-17
 21. B001002  DO YOU HAVE ART ONCE PER WEEK                       CB-18          CB-18           CB-18
 22. B001003  DO YOU HAVE MUSIC ONCE PER WEEK                     CB-19          CB-19           CB-19
 23. B001004  DO YOU HAVE FOREIGN LANGUAGE ONCE PER WEEK          CB-20          CB-20           CB-20
 24. B001005  DO YOU HAVE COMPUTER CLASS ONCE PER WEEK            CB-21          CB-21           CB-21
 25. B001006  DO YOU HAVE DRAMA CLASS ONCE PER WEEK               CB-22          CB-22           CB-22
 26. B001007  DO YOU HAVE SCIENCE ONCE PER WEEK                   CB-23          CB-23           CB-23
 27. B001101  HOW MANY PAGES READ IN SCHOOL AND FOR HOMEWORK      CB-24          CB-24           CB-24
 28. B001201  STORIES WRITTEN FOR ENGLISH LAST WEEK               CB-25          CB-25           CB-25
 29. B001202  ESSAYS WRITTEN FOR ENGLISH CLASS LAST WEEK          CB-26          CB-26           CB-26
 30. B001203  POEMS WRITTEN FOR ENGLISH CLASS LAST WEEK           CB-27          CB-27           CB-27
 31. B001204  PLAYS WRITTEN FOR ENGLISH CLASS LAST WEEK           CB-28          CB-28           CB-28
 32. B001205  LETTERS WRITTEN FOR ENGLISH CLASS LAST WEEK         CB-29          CB-29           CB-29
 33. B001206  BOOK REPORTS WRITTEN FOR ENGLISH CLASS LAST WEEK    CB-30          CB-30           CB-30
 34. B001207  OTHER REPORTS WRITTEN FOR ENGLISH CLASS LAST WEEK   CB-31          CB-31           CB-31
 35. B001208  I DO NOT HAVE AN ENGLISH CLASS                      CB-32          CB-32           CB-32
 36. B001401  WHAT WAS THE LAST THING YOU READ FOR SCHOOL         CB-34          CB-34           CB-34
 37. B001501  WHAT WAS THE LAST THING YOU READ ON YOUR OWN        CB-35          CB-35           CB-35
 38. B001701  HOW MUCH TIME DID YOU SPEND ON HOMEWORK YESTERDAY   CB-33          CB-33           CB-33
 39. B001801  HOW MUCH TELEVISION DO YOU WATCH EACH DAY           CB-37          CB-37           CB-37
 40. B001901  WHICH DESCRIBES YOUR GRADES IN SCHOOL               CB-36          CB-36           CB-36
 41. B002001  HOW MANY DIFFERENT TOWNS HAVE YOU LIVED IN          CB-16          CB-16           CB-16
 42. B002701  COURSE WORK COMPLETED: MATHEMATICS                  --             --              CB-38
 43. B002702  COURSE WORK COMPLETED: ENGLISH OR LITERATURE        --             --              CB-39
 44. B002703  COURSE WORK COMPLETED: JOURNALISM                   --             --              CB-40
 45. B002704  COURSE WORK COMPLETED: FOREIGN LANGUAGE             --             --              CB-41



Table A(3)

Background and Attitude Items and Locations (Spiral)

   # NAEP ID  DESCRIPTION                                         Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17

 46. B002705  COURSE WORK COMPLETED: HISTORY OR SOCIAL STUDIES    --             --              CB-42
 47. B002706  COURSE WORK COMPLETED: SCIENCE                      --             --              CB-43
 48. B002707  COURSE WORK COMPLETED: COMPUTERS OR PROGRAMMING     --             --              CB-44
 49. B002708  COURSE WORK COMPLETED: BUSINESS OR VOCATIONAL       --             --              CB-45
 50. B002709  COURSE WORK COMPLETED: ARTS                         --             --              CB-46
 51. B002710  COURSE WORK COMPLETED: MUSIC                        --             --              CB-47
 52. B002801  HOURS PER WEEK WORKING IN PART-TIME JOB             --             --              CB-48
 53. S000101  TIME SPENT IN ENGLISH CLASS LEARNING TO WRITE       A-01 U-08      A-01 U-08       A-01 U-08
 54. S000201  REPORTS AND PAPERS WRITTEN FOR SCHOOL LAST 6 WEEKS  A-02 U-09      A-02 U-09       A-02 U-09
 55. S000301  WRITINGS DONE LAST WEEK FOR SOCIAL STUDIES CLASS    A-03 U-10      A-03 U-10       A-03 U-10
 56. S000401  WRITINGS DONE LAST WEEK FOR SCIENCE                 A-04 U-11      A-04 U-11       A-04 U-11
 57. S000501  WRITINGS DONE LAST WEEK NON-SCHOOL RELATED          A-05           A-05            A-05
 58. S000601  WHEN WRITING HOW OFTEN TEACHER ASKS TO MAKE NOTES   A-06 U-01      A-06 U-01       A-06 U-01
 59. S000602  WHEN WRITING HOW OFTEN TEACHER ASKS MAKE OUTLINE    A-07 U-02      A-07 U-02       A-07 U-02
 60. S000603  WHEN WRITING HOW OFTEN TEACHER ASKS NOTE CHANGES    A-08 U-03      A-08 U-03       A-08 U-03
 61. S000604  WHEN WRITING HOW OFTEN TEACHER ASKS TALK TEACHER    A-09 U-04      A-09 U-04       A-09 U-04
 62. S000605  WHEN WRITING HOW OFTEN TEACHER ASKS TALK MATES      A-10 U-05      A-10 U-05       A-10 U-05
 63. S000606  WHEN WRITING HOW OFTEN TEACHER ASK REDO BEFOR GRD   A-11 U-06      A-11 U-06       A-11 U-06
 64. S000607  WHEN WRITING HOW OFTEN TEACHER ASK REDO AFTER GRD   A-12 U-07      A-12 U-07       A-12 U-07
 65. S000701  HOW OFTEN IS TRUE: WRITING IS IMPORTANT             B-01           B-01            B-01
 66. S000702  HOW OFTEN IS TRUE: WRITING HELPS LEARN ABOUT SELF   B-02           B-02            B-02
 67. S000703  HOW OFTEN IS TRUE: WRITING REMINDS ABOUT THINGS     B-03           B-03            B-03
 68. S000704  HOW OFTEN IS TRUE: WRITING HELPS ME STUDY           B-04           B-04            B-04
 69. S000705  HOW OFTEN IS TRUE: WRITING HELPS NEW IDEAS          B-05           B-05            B-05
 70. S000901  WHEN WRITING HOW OFTEN ASK SELF SUBJECT PAPER       B-06 V-01      B-06 V-01       B-06 V-01
 71. S000902  WHEN WRITING HOW OFTEN LOOK UP FACTS IN BOOKS       B-07 V-02      B-07 V-02       B-07 V-02
 72. S000903  WHEN WRITING HOW OFTEN THINK BEFORE WRITING         B-08 V-03      B-08 V-03       B-08 V-03
 73. S000904  WHEN WRITING HOW OFTEN THINK ABOUT ORGANIZATION     B-09 V-04      B-09 V-04       B-09 V-04
 74. S000905  WHEN WRITING HOW OFTEN USE DIFF STYLES PER PERSON   B-10 V-05      B-10 V-05       B-10 V-05
 75. S000906  WHEN WRITING HOW OFTEN MAKE CHANGES AS YOU WRITE    B-11 V-06      B-11 V-06       B-11 V-06
 76. S000907  WHEN WRITING HOW OFTEN MAKE CHANGES AFTER WRITING   B-12 V-07      B-12 V-07       B-12 V-07
 77. S001001  HOW OFTEN HAVE YOU SHOWN FRIENDS YOUR WRITINGS      B-13 Q-07      B-13 Y-01       B-13 Y-01
 78. S001002  HOW OFTEN HAVE PAPERS BEEN PRINTED IN SCHOOL PAPER  B-14 Q-08      B-14 Y-02       B-14 Y-02
 79. S001003  HOW OFTEN DOES YOUR FAMILY READ YOUR PAPERS         B-15 Q-09      B-15 Y-03       B-15 Y-03
 80. S001101  HOW OFTEN DOES FAMILY LIST THINGS TO BUY OR DO      C-01           C-01            C-01
 81. S001102  HOW OFTEN DOES FAMILY COPY RECIPES OR DIRECTIONS    C-02           C-02            C-02
 82. S001103  HOW OFTEN DOES FAMILY FILL OUT ORDER BLANKS         C-03           C-03            C-03
 83. S001104  HOW OFTEN DOES FAMILY WRITE CHECKS/KEEP BUDGETS     C-04           C-04            C-04
 84. S001105  HOW OFTEN DOES FAMILY KEEP DIARIES OR JOURNALS      C-05           C-05            C-05
 85. S001106  HOW OFTEN DOES FAMILY WORK CROSSWORD PUZZLE         C-06           C-06            C-06
 86. S001201  HOW OFTEN IS TRUE: I LIKE TO WRITE                  C-07           C-07            C-07
 87. S001202  HOW OFTEN IS TRUE: I AM A GOOD WRITER               C-08           C-08            C-08
 88. S001203  HOW OFTEN IS TRUE: THINK WRITING IS WASTE OF TIME   C-09           C-09            C-09
 89. S001204  HOW OFTEN IS TRUE: PEOPLE LIKE WHAT I WRITE         C-10           C-10            C-10
 90. S001205  HOW OFTEN IS TRUE: WRITE ON OWN AWAY FROM SCHOOL    C-11           C-11            C-11




Table A(3)

Background and Attitude Items and Locations (Spiral)

   # NAEP ID  DESCRIPTION                                         Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17

 91. S001206  HOW OFTEN IS TRUE: DISLIKE WRITING TO BE GRADED     C-12           C-12            C-12
 92. S001207  HOW OFTEN IS TRUE: WOULDNT WRITE IF NOT FOR SCHOOL  C-13           C-13            C-13
 93. S001301  HOW OFTEN IS TRUE: MOVE SENTENCES AROUND            C-14 X-01      C-14 X-01       C-14 X-01
 94. S001302  HOW OFTEN IS TRUE: ADD NEW IDEAS OR INFORMATION     C-15 X-02      C-15 X-02       C-15 X-02
 95. S001303  HOW OFTEN IS TRUE: TAKE OUT UNDESIRED PARTS         C-16 X-03      C-16 X-03       C-16 X-03
 96. S001304  HOW OFTEN IS TRUE: CHANGE WORDS                     C-17 X-04      C-17 X-04       C-17 X-04
 97. S001305  HOW OFTEN IS TRUE: CORRECT SPELLING MISTAKES        C-18 X-05      C-18 X-05       C-18 X-05
 98. S001306  HOW OFTEN IS TRUE: CORRECT GRAMMAR MISTAKES         C-19 X-06      C-19 X-06       C-19 X-06
 99. S001307  HOW OFTEN IS TRUE: CORRECT PUNCTUATION MISTAKES     C-20 X-07      C-20 X-07       C-20 X-07
100. S001308  HOW OFTEN IS TRUE: REWRITE MOST OF PAPER            C-21 X-08      C-21 X-08       C-21 X-08
101. S001309  HOW OFTEN IS TRUE: THROW OUT AND START OVER         C-22 X-09      C-22 X-09       C-22 X-09
102. S001401  HOW OFTEN IS TRUE: GOOD WRITING GETS A BETTER JOB   D-01           D-01            D-01
103. S001402  HOW OFTEN IS TRUE: GOOD WRITING MORE INFLUENTIAL    D-02           D-02            D-02
104. S001501  HOW OFTEN TRUE: WRITING HELPS THINK MORE CLEARLY    D-03           D-03            D-03
105. S001502  HOW OFTEN TRUE: WRITING HELPS TELL OTHERS THINKING  D-04           D-04            D-04
106. S001503  HOW OFTEN TRUE: WRITING HELPS TELL OTHERS FEELINGS  D-05           D-05            D-05
107. S001504  HOW OFTEN TRUE: WRITING HELPS UNDERSTAND MYSELF     D-06           D-06            D-06
108. S001601  HOW OFTEN DO YOU LIST THINGS TO BUY                 D-07           D-07            D-07
109. S001602  HOW OFTEN DO YOU COPY RECIPES OR DIRECTIONS         D-08           D-08            D-08
110. S001603  HOW OFTEN DO YOU FILL OUT ORDER BLANKS              D-09           D-09            D-09
111. S001604  HOW OFTEN DO YOU KEEP A DIARY OR JOURNAL            D-10           D-10            D-10
112. S001605  HOW OFTEN DO YOU DO A CROSSWORD PUZZLE              D-11           D-11            D-11
113. S001606  HOW OFTEN DO YOU HELP OTHER STUDENTS WITH WRITING   D-12           D-12            D-12
114. S001607  HOW OFTEN DO YOU WRITE ABOUT WHAT YOU HAVE READ     D-13           D-13            D-13
115. S001608  HOW OFTEN DO YOU WRITE PAPERS TOO PERSONAL TO SHOW  D-14           D-14            D-14
116. S001609  HOW OFTEN DO YOU WRITE FOR SCHOOL NEWSPAPER         D-15           D-15            D-15
117. S001701  HOW OFTEN DOES TEACHER TALK RE: FOLLOW DIRECTIONS   D-16 W-23      D-16 W-23       D-16 W-05
118. S001702  HOW OFTEN DOES TEACHER TALK RE: WROTE ENOUGH        D-17 W-24      D-17 W-24       D-17 W-06
119. S001703  HOW OFTEN DOES TEACHER TALK RE: IDEAS IN PAPER      D-18 W-25      D-18 W-25       D-18 W-07
120. S001704  HOW OFTEN DOES TEACHER TALK RE: EXPLAIN IN PAPER    D-19 W-26      D-19 W-26       D-19 W-08
121. S001705  HOW OFTEN DOES TEACHER TALK RE: FEELINGS IN PAPER   D-20 W-27      D-20 W-27       D-20 W-09
122. S001706  HOW OFTEN DOES TEACHER TALK RE: ORGANIZING PAPER    D-21 W-28      D-21 W-28       D-21 W-10
123. S001707  HOW OFTEN DOES TEACHER TALK RE: WORDS IN PAPER      D-22 W-29      D-22 W-29       D-22 W-11
124. S001708  HOW OFTEN DOES TEACHER TALK RE: SP, GRAM IN PAPER   D-23 W-30      D-23 W-30       D-23 W-12
125. S001709  HOW OFTEN DOES TEACHER TALK RE: NEATNESS IN PAPER   D-24 W-31      D-24 W-31       D-24 W-13
126. S001801  HOW OFTEN DOES FAMILY WRITE LETTERS TO RELATIVES    E-01           E-01            E-01
127. S001802  HOW OFTEN DOES FAMILY WRITE NOTES OR MESSAGES       E-02           E-02            E-02
128. S001803  HOW OFTEN DOES FAMILY WRITE STORIES OR POEMS        E-03           E-03            E-03
129. S001804  HOW OFTEN DOES FAMILY WRITE BUSINESS LETTERS        E-04           E-04            E-04
130. S001901  HOW OFTEN DO YOU WRITE A BOOK REPORT                E-05           E-05            E-05
131. S001902  HOW OFTEN DO YOU WRITE ABOUT SCIENCE EXPERIMENT     E-06           E-06            E-06
132. S001903  HOW OFTEN DO YOU WRITE LETTER TO A RELATIVE         E-07           E-07            E-07
133. S001904  HOW OFTEN DO YOU WRITE NOTES OR MESSAGES            E-08           E-08            E-08
134. S001905  HOW OFTEN DO YOU WRITE STORIES THAT NOT HOMEWORK    E-09           E-09            E-09
135. S002001  WHAT WAS THE LAST THING YOU WROTE IN SCHOOL         F-01 V-14      F-01 V-14       F-01 V-14




Table A(3)

Background and Attitude Items and Locations (Spiral)

   # NAEP ID  DESCRIPTION                                         Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17

136. S002002  LAST WRITING IN SCHOOL: COPY OVER BEFORE SUBMITING  F-02 V-15      F-02 V-15       F-02 V-15
137. S002003  LAST WRITING IN SCHOOL: MAKE CHANGES BEFORE SUBMIT  F-03 V-16      F-03 V-16       F-03 V-16
138. S002004  LAST WRITING IN SCHOOL: MAKE CHANGES AFTER RETURND  F-04 V-17      F-04 V-17       F-04 V-17
139. S002005  LAST WRITING IN SCHOOL: LIKE DOING THE WRITING      F-05 V-18      F-05 V-18       F-05 V-18
140. S002501  HOW OFTEN DOES TEACHER MARK ERRORS ON PAPERS        G-01 V-19      G-01 V-19       G-01 V-19
141. S002502  HOW OFTEN DOES TEACHER WRITE NOTES ON PAPERS        G-02 V-20      G-02 V-20       G-02 V-20
142. S002503  HOW OFTEN DOES TEACHER POINT OUT GOOD THINGS        G-03 V-21      G-03 V-21       G-03 V-21
143. S002504  HOW OFTEN DOES TEACHER POINT OUT NOT GOOD THINGS    G-04 V-22      G-04 V-22       G-04 V-22
144. S002505  HOW OFTEN DOES TEACHER MAKE SUGGESTIONS FOR NEXT    G-05 V-23      G-05 V-23       G-05 V-23
145. S002506  HOW OFTEN DOES TEACHER SHOW INTEREST IN WRITING     G-06 V-24      G-06 V-24       G-06 V-24
146. S002701  DID YOU GO TO KINDERGARTEN                          H-01           H-01            H-01
147. S002702  DID YOU GO TO DAY CARE                              H-02           H-02            H-02
148. S002703  DID YOU GO TO NURSERY SCHOOL                        H-03           H-03            H-03
149. S002704  DID YOU GO TO HEADSTART                             H-04           H-04            H-04
150. S002801  WHERE DID YOU LIVE AT AGE 9                         --             H-05            H-05
151. S002802  WHERE DID YOU LIVE AT AGE 9: STATE                  --             H-05            H-05
152. S002803  WHERE DID YOU LIVE AT AGE 9: COUNTRY                --             H-05            H-05
153. S002804  RESIDENCE AT AGE 9 VS. CURRENT RESIDENCE            --             H-05            H-05
154. S002901  DO YOU USE A COMPUTER AT HOME                       J-01           J-01            J-01
155. S002902  DO YOU USE A COMPUTER AT THE LIBRARY                J-02           J-02            J-02
156. S002903  DO YOU USE A COMPUTER AT A FRIENDS HOUSE            J-03           J-03            J-03
157. S002904  HOW OFTEN DO YOU USE A COMPUTER AT SCHOOL           J-04           J-04            J-04
158. S003001  DO YOU USE A COMPUTER TO PLAY GAMES                 J-05           J-05            J-05
159. S003002  DO YOU USE A COMPUTER TO LEARN THINGS               J-06           J-06            J-06
160. S003003  DO YOU USE A COMPUTER TO WRITE STORIES OR PAPERS    J-07           J-07            J-07
161. S003101  HOW OFTEN DO YOU WRITE COMPUTER PROGRAMS            J-08           J-08            J-08
162. S003201  WHAT DO YOU USUALLY DO AFTER SCHOOL                 [unclear]      [unclear]       [unclear]
163. S003202  IF YOU GO HOME AFTER SCHOOL, WHO IS USUALLY THERE   [unclear]      [unclear]       [unclear]
164. S003203  WHAT OTHER ADULT                                    [unclear]      [unclear]       [unclear]
165. S003301  WHAT KIND OF READER ARE YOU                         K-01           K-01 V-28       K-01 V-25
166. S003401  DO YOU EXPECT TO GRADUATE FROM HIGH SCHOOL          K-02           K-02            K-02
167. S003501  HOW OFTEN DO YOU READ FOR FUN ON YOUR OWN TIME      K-03 V-08      K-03 V-08       K-03 V-08
168. S003502  HOW OFTEN DO YOU TELL A FRIEND ABOUT A GOOD BOOK    K-04 V-09      K-04 V-09       K-04 V-09
169. S003503  HOW OFTEN DO YOU TAKE BOOKS OUT OF THE LIBRARY      K-05 V-10      K-05 V-10       K-05 V-10
170. S003504  HOW OFTEN DO YOU SPEND YOUR OWN MONEY ON BOOKS      K-06 V-11      K-06 V-11       K-06 V-11
171. S003505  HOW OFTEN DO YOU READ BOOK BASED ON MOVIE YOU SAW   K-07 V-12      K-07 V-12       K-07 V-12
172. S003506  HOW OFTEN DO YOU READ BOOKS BY AN AUTHOR YOU LIKE   K-08 V-13      K-08 V-13       K-08 V-13
173. S003601  HOW OFTEN DO YOU GO TO A MOVIE                      L-01           L-01            L-01
174. S003602  HOW OFTEN DO YOU GO TO A PLAY                       L-02           L-02            L-02
175. S003603  HOW OFTEN DO YOU GO TO A CONCERT                    L-03           L-03            L-03
176. S003604  HOW OFTEN DO YOU GO TO A PARTY                      L-04           L-04            L-04
177. S003605  HOW OFTEN DO YOU GO TO THE PUBLIC LIBRARY           L-05           L-05            L-05
178. S003606  HOW OFTEN DO YOU TRAVEL TO A PLACE AWAY FROM HOME   L-06           L-06            L-06
179. S003607  HOW OFTEN DO YOU GO SHOPPING                        L-07           L-07            L-07
180. S003608  HOW OFTEN DO YOU GO TO A SPORTS EVENT               L-08           L-08            L-08

Items 162-164: the scan lists the entries J-09, J-09, J-10, J-10, J-10, J-10; their grade columns are not recoverable.



Table A(3)

Background and Attitude Items and Locations (Spiral)

   # NAEP ID  DESCRIPTION                                         Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17

181. S003609  HOW OFTEN DO YOU PLAY CARD OR TABLE GAMES           L-09           L-09            L-09
182. S003610  HOW OFTEN DO YOU VISIT RELATIVES                    L-10           L-10            L-10
183. S003611  HOW OFTEN DO YOU GO TO A MUSEUM                     L-11           L-11            L-11
184. S003612  HOW OFTEN DO YOU GO CAMPING                         L-12           L-12            L-12
185. S003613  HOW OFTEN DO YOU STAY HOME ALONE                    L-13           L-13            L-13
186. S003614  WHAT ACTIVITY DO YOU DO MOST OFTEN                  L-14           L-14            L-14
187. S003701  DO YOU EVER FEEL BORED AT SCHOOL                    L-15           L-15            L-15
188. S003801  DURING PAST YEAR HOW OFTEN SENT TO PRINCIPALS OFF   L-16           L-16            L-16
189. S003802  DURING PAST YEAR HOW OFTEN PLACED ON PROBATION      --             L-17            L-17
190. S003803  DURING PAST YEAR HOW OFTEN GIVEN DETENTION          --             L-18            L-18
191. S003804  DURING PAST YEAR HOW OFTEN WARNED ABOUT ATTENDANCE  L-17           L-19            L-19
192. S003805  DURING PAST YEAR HOW OFTEN WARNED ABOUT GRADES      L-18           L-20            L-20
193. S003806  DURING PAST YEAR HOW OFTEN WARNED ABOUT BEHAVIOR    L-19           L-21            L-21
194. S003901  HOW MANY OLDER BROTHERS AND SISTERS                 M-01           M-01            M-01
195. S003902  HOW MANY YOUNGER BROTHERS AND SISTERS               M-02           M-02            M-02
196. S004001  HOW MANY DAYS OF SCHOOL MISSED LAST MONTH           M-03           M-03            M-03
197. S004101  HOW MANY TIMES LATE FOR SCHOOL LAST MONTH           M-04           M-04            M-04
198. S004201  HOW OFTEN READING: HELPS ME DECIDE WANT TO BE       N-01           N-01            N-01
199. S004202  HOW OFTEN READING: HELPS ME LEARN TO FIX THINGS     N-02           N-02            N-02
200. S004203  HOW OFTEN READING: HELPS UNDERSTAND PEOPLES ACTION  N-03           N-03            N-03
201. S004204  HOW OFTEN READING: READING IS IMPORTANT             N-04           N-04            N-04
202. S004205  HOW OFTEN READING: BETTER FEWER HARD WORDS          N-05           N-05            N-05
203. S004206  HOW OFTEN READING: BETTER FEWER LONG SENTENCES      N-06           N-06            N-06
204. S004207  HOW OFTEN READING: BETTER IF IT MATTERED TO ME      N-07           N-07            N-07
205. S004208  HOW OFTEN READING: BETTER IF TEACH GAVE MORE TIME   N-08           N-08            N-08
206. S004209  HOW OFTEN READING: BETTER IF DIDNT HAVE SO MUCH     N-09           N-09            N-09
207. S004210  HOW OFTEN READING: BETTER IF WASNT TESTED ON IT     N-10           N-10            N-10
208. S004211  HOW OFTEN READING: LIKE MORE IF COULD TALK W OTHER  N-11           N-11            N-11
209. S004301  HOW OFTEN DO YOU READ A STORY OR NOVEL              O-01           O-01            O-01
210. S004302  HOW OFTEN DO YOU READ A POEM                        O-02           O-02            O-02
211. S004303  HOW OFTEN DO YOU READ A PLAY                        O-03           O-03            O-03
212. S004304  HOW OFTEN DO YOU READ A NEWSPAPER                   O-04           O-04            O-04
213. S004305  HOW OFTEN DO YOU READ A MAGAZINE                    O-05           O-05            O-05
214. S004306  HOW OFTEN DO YOU READ A SCIENCE BOOK                O-06           O-06            O-06
215. S004307  HOW OFTEN DO YOU READ A BIOGRAPHY                   O-07           O-07            O-07
216. S004308  HOW OFTEN DO YOU READ A HOW-TO-DO BOOK              O-08           O-08            O-08
217. S004309  HOW OFTEN DO YOU READ A BOOK ABOUT OTHER TIMES      O-09           O-09            O-09
218. S004310  HOW OFTEN DO YOU READ A SPORTS BOOK                 O-10           O-10            O-10
219. S004311  HOW OFTEN DO YOU READ WORDS OF A SONG               O-11           O-11            O-11
220. S004401  HOW OFTEN DOES SOMEONE READ ALOUD TO YOU            P-01           P-01            P-01
221. S004402  HOW OFTEN DO YOU READ ALOUD TO SOMEONE              P-02           P-02            P-02
222. S004501  HOW OFTEN DOES FAMILY READ NEWSPAPERS               P-03           P-03            P-03
223. S004502  HOW OFTEN DOES FAMILY READ MAGAZINES                P-04           P-04            P-04
224. S004503  HOW OFTEN DOES FAMILY READ BOOKS                    P-05           P-05            P-05
225. S004504  HOW OFTEN DOES FAMILY READ RECIPES                  P-06           P-06            P-06



Table A(3)

Background and Attitude Items and Locations (Spiral)

   # NAEP ID  DESCRIPTION                                         Grade 4/Age 9        Grade 8/Age 13       Grade 11/Age 17

226. S004601  HOW OFTEN WITH NEW READING TEACHER POINT HARD WORD  Q-01 U-12 V-25 X-10  Q-01 U-12 V-25 X-10  Q-01 U-12 V-25 X-10
227. S004602  HOW OFTEN WITH NEW READING TEACHER PREVIEW READING  Q-02 U-13 V-26 X-11  Q-02 U-13 V-26 X-11  Q-02 U-13 V-26 X-11
228. S004603  HOW OFTEN WITH NEW READING TEACHER READ PART ALOUD  Q-03 U-14 V-27 X-12  Q-03 U-14 V-27 X-12  Q-03 U-14 V-27 X-12
229. S004701  HOW OFTEN DOES TEACHER LIST OF QUESTS AS YOU READ   Q-04 U-15 X-13       Q-04 U-15 X-13       Q-04 U-15 X-13
230. S004702  HOW OFTEN DOES TEACHER TELL HOW TO FIND MAIN IDEA   Q-05 U-16 X-14       Q-05 U-16 X-14       Q-05 U-16 X-14
231. S004703  HOW OFTEN DOES TEACHER TELL HOW TO READ FASTER      Q-06 U-17 X-15       Q-06 U-17 X-15       Q-06 U-17 X-15
232. S004801  HOW OFTEN TRUE: WRITING HELPS ME GET A GOOD JOB     R-01                 R-01                 R-01
233. S004802  HOW OFTEN TRUE: WRITING HELPS ME SHARE MY IDEAS     R-02                 R-02                 R-02
234. S004803  HOW OFTEN TRUE: WRITING HELPS SHOW I KNOW THINGS    R-03                 R-03                 R-03
235. S004804  HOW OFTEN TRUE: WRITING HELPS KEEP IN TOUCH FRIEN   R-04                 R-04                 R-04
236. S005001  WHEN FREE TIME, HOW OFTEN WATCH TV                  S-01 W-01            S-01 W-01            S-01
237. S005002  WHEN FREE TIME, HOW OFTEN READ A BOOK               S-02 W-02            S-02 W-02            S-02
238. S005003  WHEN FREE TIME, HOW OFTEN WRITE IN DIARY            S-03 W-03            S-03 W-03            S-03
239. S005004  WHEN FREE TIME, HOW OFTEN CALL A FRIEND             S-04 W-04            S-04 W-04            S-04
240. S005005  WHEN FREE TIME, HOW OFTEN BE WITH FRIENDS           S-05 W-05            S-05 W-05            S-05
241. S005006  WHEN FREE TIME, HOW OFTEN GO SHOPPING               S-06 W-06            S-06 W-06            S-06
242. S005007  WHEN FREE TIME, HOW OFTEN PLAY A SPORT              S-07 W-07            S-07 W-07            S-07
243. S005008  WHEN FREE TIME, HOW OFTEN GO HUNTING OR FISHING     S-08 W-08            S-08 W-08            S-08
244. S005009  WHEN FREE TIME, HOW OFTEN TAKE A WALK               S-09 W-09            S-09 W-09            S-09
245. S005010  WHEN FREE TIME, HOW OFTEN WORK AT A COMPUTER        S-10 W-10            S-10 W-10            S-10
246. S005011  WHEN FREE TIME, HOW OFTEN PLAY VIDEO GAMES          S-11 W-11            S-11 W-11            S-11
247. S005012  WHEN FREE TIME, HOW OFTEN READ A NEWSPAPER          S-12 W-12            S-12 W-12            S-12
248. S005013  WHEN FREE TIME, HOW OFTEN GET A SNACK               S-13 W-13            S-13 W-13            S-13
249. S005014  WHEN FREE TIME, HOW OFTEN DO EXTRA HOMEWORK         S-14 W-14            S-14 W-14            S-14
250. S005015  WHEN FREE TIME, HOW OFTEN WRITE A LETTER            S-15 W-15            S-15 W-15            S-15
251. S005016  WHEN FREE TIME, HOW OFTEN LISTEN TO MUSIC           S-16 W-16            S-16 W-16            S-16
252. S005017  WHEN FREE TIME, HOW OFTEN DO SOMETHING ELSE         S-17 W-17            S-17 W-17            S-17
253. S005018  WHEN FREE TIME WHAT ACTIVITY SPEND MOST TIME        S-18 W-18            S-18 W-18            S-18
254. S005101  HOW OFTEN WHEN STUDY FOR TEST: READ OVER MATERIAL   T-01                 T-01                 T-01
255. S005102  HOW OFTEN WHEN STUDY FOR TEST: TAKE NOTES ON READ   T-02                 T-02                 T-02
256. S005103  HOW OFTEN WHEN STUDY FOR TEST: MAKE OUTLINES        T-03                 T-03                 T-03
257. S005104  HOW OFTEN WHEN STUDY FOR TEST: QUES IN TEXTBOOK     T-04                 T-04                 T-04
258. S005105  HOW OFTEN WHEN STUDY FOR TEST: ANSWER OWN QUESTNS   T-05                 T-05                 T-05
259. S005106  HOW OFTEN WHEN STUDY FOR TEST: QUESTION OTHERS      T-06                 T-06                 T-06
260. S005201  HOW OFTEN DO YOU READ ALOUD IN SCHOOL               T-07                 T-07                 T-07
261. S005202  HOW OFTEN DO YOU READ ON YOUR OWN IN SCHOOL         T-08                 T-08                 T-08
262. S005203  HOW OFTEN DO YOU WORK IN A WORKBOOK                 T-09                 T-09                 T-09
263. S005301  HOW OFTEN GO TO LIBRARY TO READ ON OWN              T-10 W-32            T-10 W-32            T-10 W-14
264. S005302  HOW OFTEN GO TO LIBRARY TO LOOK UP FACT FOR SCHOOL  T-11 W-33            T-11 W-33            T-11 W-15
265. S005303  HOW OFTEN GO TO LIBRARY TO FIND BOOKS FOR HOBBIES   T-12 W-34            T-12 W-34            T-12 W-16
266. S005304  HOW OFTEN GO TO LIBRARY FOR QUIET PLACE TO READ     T-13 W-35            T-13 W-35            T-13 W-17
267. S005305  HOW OFTEN GO TO LIBRARY TO TAKE OUT BOOKS           T-14 W-36            T-14 W-36            T-14 W-18
268. S005401  HOW OFTEN DO YOU WATCH NEWS ON TELEVISION           T-15 W-19            T-15 W-19            T-15 W-01
269. S005402  HOW OFTEN DO YOU READ A NEWS MAGAZINE               T-16 W-20            T-16 W-20            T-16 W-02
270. S005403  HOW OFTEN DO YOU READ NEWSPAPER NOT COMICS OR SPRT  T-17 W-21            T-17 W-21            T-17 W-03




Table

Background  and  Attitude  Items


NAEP  ID


DESCRIPTION 


271.  [illegible]  LISTEN TO NEWS ON RADIO
272.  [illegible]  [illegible] ADMISSION TO A COLLEGE OR UNV
273.  [illegible]  APPLIED TO A FOUR-YEAR COLLEGE
274.  [illegible]  APPLIED TO A TWO-YEAR COLLEGE
275.  [illegible]  APPLIED TO OTHER COLLEGE OR UNIVERSITY
276.  [illegible]  LONG-TERM CAREER GOALS
277.  S005802  LONG-TERM CAREER GOAL CODE
278.  S005901  WHERE DID YOU LIVE AT AGE 13
279.  S005902  WHERE DID YOU LIVE AT AGE 13: STATE
280.  S005903  WHERE DID YOU LIVE AT AGE 13: COUNTRY
281.  [illegible]  [illegible] RESIDENCE
282.  [illegible]  COURSES HAVE YOU TAKEN:GENERAL SCIENCE
283.  [illegible]  COURSES HAVE YOU TAKEN:BIOLOGY
284.  [illegible]  COURSES HAVE YOU TAKEN:CHEMISTRY
285.  [illegible]  COURSES HAVE YOU TAKEN:PHYSICS
286.  [illegible]  COURSES HAVE YOU TAKEN:OTHER SCIENCE (1)
287.  [illegible]  COURSES HAVE YOU TAKEN:OTHER SCIENCE 2
288.  [illegible]  COURSES HAVE YOU TAKEN:OTHER SCIENCE 3
289.  [illegible]  COURSES HAVE YOU TAKEN:GENERAL MATH 1
290.  [illegible]  COURSES HAVE YOU TAKEN:GENERAL MATH 2
291.  [illegible]  COURSES HAVE YOU TAKEN:FIRST YEAR ALGEBRA
292.  [illegible]  COURSES HAVE YOU TAKEN:SECOND YEAR ALGEBRA
293.  [illegible]  COURSES HAVE YOU TAKEN:GEOMETRY
294.  [illegible]  COURSES HAVE YOU TAKEN:CALCULUS
295.  [illegible]  COURSES HAVE YOU TAKEN:OTHER MATH COURSE (1)
296.  [illegible]  COURSES HAVE YOU TAKEN:OTHER MATH COURSE 2
297.  [illegible]  COURSES HAVE YOU TAKEN:OTHER MATH COURSE 3
298.  [illegible]  SCHOOL IN:PREPARING STUDENTS FOR COLLEGE
299.  [illegible]  SCHOOL IN:PREPARING STUDENTS FOR CAREER
300.  [illegible]  SCHOOL IN:PREPARING STUDENTS FOR LIFE
301.  [illegible]  SCHOOL IN:VARIETY OF EXTRACUR ACTIVITIES
302.  [illegible]  SCHOOL IN:QUALITY OF EXTRACUR ACTIVITES
303.  [illegible]  SCHOOL IN:FACULTY INTEREST IN STUDENTS
304.  [illegible]  SCHOOL IN:QUALITY OF FACULTY
305.  [illegible]  SCHOOL IN:QUALITY OF STUDENT LIFE
306.  S006301  SCHOOL EXPERIENCES:I AM SATSFIED WITH MY [illegible]
307.  [illegible]  SCHOOL EXPERIENCES:NOT LEARN WHAT I NEED TO KNOW
308.  [illegible]  SCHOOL EXPERIENCES:HAVE HAD DISCIPLNE PRBS THIS YR
309.  [illegible]  SCHOOL EXPERIENCES:I AM INTERESTED IN SCHOOL
310.  [illegible]  SCHOOL EXPERIENCES:ONCE IN A WHILE I CUT CLASS
311.  [illegible]  SCHOOL EXPERIENCES:I DON'T FEEL SAFE AT THIS SCHL
312.  [illegible]  SCHOOL EXPERIENCES:WISH I COULD GOTO SCHOOL
313.  [illegible]  HAVE YOU EVER BEEN IN PROGRAM:REMEDIAL ENGLISH
314.  [illegible]  HAVE YOU EVER BEEN IN PROGRAM:REMEDIAL MATH
315.  S006403  HAVE YOU EVER BEEN IN PROGRAM:HONORS ENGLISH




and  Locations  (Spiral)

Grade  4/Age  9    Grade  8/Age  13    Grade  11/Age  17


T-18  W-22


T-18  W-22 


T-18  W-04 

J-10 

J-10 

J-10 

J-10 

J-11 

J-11 

H-06 

H-06 

H-06 

H-06 

L-22 

L-23 

L-24 

L-25 

L-26 

L-26 

L-26 

N-12 

N-13 

N-14 

N-15 

N-16 

N-17 

N-18

N-19 

N-20

P-07 

P-08 

P-09 

P-10 

P-11 

P-12 

P-13 

P-14 

R-05 

R-06 

R-07

R-08 

R-09 

R-10 

R-11 

V-28 

V-29 

V-30 




Tab: a     A( 3 ) 

Background  and  Attitude  Items  and  Locations  (Spiral)


      NAEP ID  DESCRIPTION                                         Grade 4/Age 9  Grade 8/Age 13  Grade 11/Age 17

316.  S006404  HAVE YOU EVER BEEN IN PROGRAM:HONORS MATHEMATICS                                   V-31
317.  S006405  HAVE YOU EVER BEEN IN PROGRAM:HONORS SCIENCE                                       V-32
318.  S006406  HAVE YOU EVER BEEN IN PROGRAM:BILINGUAL PROGRAM                                    V-33
319.  S006407  HAVE YOU EVER BEEN IN PROGRAM:FAMILY-LIFE,SEX ED                                   V-34
320.  S006408  HAVE YOU EVER BEEN IN PROGRAM:ALCOHL,DRUG-ABUSE ED                                 V-35
321.  S006409  HAVE YOU EVER BEEN IN PROGRAM:SPEC PHYSICAL PROGRM                                 V-36
322.  S006410  HAVE YOU EVER BEEN IN PROGRAM:SPEC SPEECH PROGRAM                                  V-37
323.  S006501  HAVE YOU TAKEN COURSES:AGRICULTURE,INCLD HORTICULT                                 W-19
324.  S006502  HAVE YOU TAKEN COURSES:AUTO MECHANICS                                              W-20
325.  S006503  HAVE YOU TAKEN COURSES:COMMERCIAL ARTS                                             W-21
326.  S006504  HAVE YOU TAKEN COURSES:COMPUTER PROGRAMMING                                        W-22
327.  S006505  HAVE YOU TAKEN COURSES:CONSTRUCTION,CARPENTRY TRDS                                 W-23
328.  S006506  HAVE YOU TAKEN COURSES:CONSTRUCTION TRADES:ELECTRL                                 W-24
329.  S006507  HAVE YOU TAKEN COURSES:CONSTRUCTION TRADES:MASONRY                                 W-25
330.  S006508  HAVE YOU TAKEN COURSES:CONSTRUCTION TRADES:PLUMBNG                                 W-26
331.  S006509  HAVE YOU TAKEN COURSES:COSMETOLOGY,HAIRDRESSING                                    W-27
332.  S006510  HAVE YOU TAKEN COURSES:DRAFTING                                                    W-28
333.  S006511  HAVE YOU TAKEN COURSES:ELECTRONICS                                                 W-29
334.  S006512  HAVE YOU TAKEN COURSES:HOME EC,DIETETICS,CHILD CAR                                 W-30
335.  S006513  HAVE YOU TAKEN COURSES:MACHINE SHOP                                                W-31
336.  S006514  HAVE YOU TAKEN COURSES:MEDICAL OR DENTAL ASSISTNT                                  W-32
337.  S006515  HAVE YOU TAKEN COURSES:PRACTICAL NURSING                                           W-33
338.  S006516  HAVE YOU TAKEN COURSES:FOOD SERVICE OCCUPATIONS                                    W-34
339.  S006517  HAVE YOU TAKEN COURSES:SALES OR MERCHANDISING                                      W-35
340.  S006518  HAVE YOU TAKEN COURSES:SECRETARIAL,OFFICE WORK                                     W-36
341.  S006519  HAVE YOU TAKEN COURSES:WELDING                                                     W-37
342.  S006601  WHAT TAKE MOST OF YOUR TIME YEAR AFTER HIGH SCHOOL                                 Y-04
343.  S006701  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:WORK                                        Y-05
344.  S006702  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:APPRENTICE                                  Y-05
345.  S006703  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:MILITARY                                    Y-05
346.  S006704  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:HOMEMAKER                                   Y-05
347.  S006705  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:VOC SCHOOL                                  Y-05
348.  S006706  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:COMM COLLEG                                 Y-05
349.  S006707  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:VOC COURSES                                 Y-05
350.  S006708  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:4-YR COLLEG                                 Y-05
351.  S006709  OTHER PLANS FOR YEAR AFTER HIGH SCHOOL:TRAVEL,NONE                                 Y-05
352.  S006801  HOW MUCH FREE TIME ON AVERAGE SCHOOL DAY                           J-11            J-09




APPENDIX  B 
Reading  Trend  Analysis  Items 




Table  B-1 

List  of  Items  Initially  Considered  for  Trend  Analysis
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 


SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 

SS 


1 

2 
3 
4 
5 
6 
7 
8 
9 
10 
11 
12 
13 
14 
15 
16 
17 
18 
19 
20 
21 
22 
23 
24 
25 
26 
27 
28 
29 
30 
31 
32 
33 
34 
35 
36 
37 
38 
39 
40 
41 
42 
43 


7099001 
7099001 
7099002 
7099002 
7099003 
7099004 
7099005 
7099005 
7099005- 
7099006- 
7099006- 
7099007- 
7099007- 
7099007- 
7099008- 
7099009- 
7099009- 
7099011- 
7099012- 
7099013- 
7099013- 
7099013 
7099014 
7099014 
7099014 
7099014 
7099014- 
7099014 
7099014- 
7099014- 
7099014- 
7099014- 
7099014- 
7099014- 
7099016- 
7099016- 
7099016- 
7099017- 
7099017 
7099017 
7099017 
7099017 
7099018 


-A001/001
-A002/003
-A001/001
-A002/002
-A001/001
-A001/001
-A001/001
-A002/002
-A003/003
-A001/001
-A002/002
-A001/001
-A002/002
-A003/003
-A001/001
-A001/001
-A002/002
-A001/001
-A001/001
-A001/001
-A002/002
-A003/003
-A001/002
-A003/004
-A005/006
-A007/008
-A009/010
-A011/012
-A013/014
-A015/016
-A017/018
-A019/020
-A021/022
-A023/024
-A001/001
-A002/002
-A003/003
-A001/002
A003/003
A004/005
-A006/007
-A008/008
-A001/001


N004002 
N004001 
N001801 
N001802 


N004201 
N004202 
N005001 
N005002 
N005003 

N003601 
N003602 

N003501 


TRIANGLE: NAME  FIGURE  AS  TRIANGLE 
TRIANGLE: DRAWING  TRIANGLE 
FLY: WANT  OF  THOUGHT-LACK  OF  THINKING 
FLY: FACING  PROBLEMS  SIMILAR  TO  HIS  OWN 
WAYFARER: FEW  SEEK  TRUTH 

DROPOUT: DROPOUTS  HAVE  HARD  TIME  GETTING  J 
ADM  DRAKE: SENT  PENGUIN 
PENGUIN  CAPT  COOKS  HOME 
PENGUINS  DIFFICULT  PETS 

MEOW-WOW: 2  MONTH  KITTEN-FEED  3  OR  4  TMS  D 
MEOW-WOW:CAT  LEAVES  FOOD-LEAVE  BOWL  FOR  H 
ARTS: BEFORE  1940  ARTS  WERE  ORIENTED  TO  EL 
ARTS: PRIVILEGE  OF  ARISTOCRATIC  FEW-GREAT 
ARTS: MASS  PROD  NO  HARM  TO  GENUINE  ART 
REASONS  FOR  DOG  OVER  CAT 
MAGIC  TRICK: FIRST  TIE  BLACK  THREAD 
MAGIC  TRICK: DIMLY  LIT  RM,  SAY  PRODUCE  FRO
CAT  POEM: WORD  PLACEMENT 

TOASTER: DRAGON/TOASTER  QUALITIES  COMPARED 
AD: BEARS  NAME  SMOKEY 
AD: PURPOSE 

AD: TELLS  TO  DROWN  FIRES 

BRIAN  GREEN  APP:NAME 

BRIAN  GREEN  APP:BIRTHDATE 

BRIAN  GREEN  APP: ADDRESS 

BRIAN  GREEN  APP: FATHER 

BRIAN  GREEN  APP: TELEPHONE 

BRIAN  GREEN  APP: BUS  ADD 

BRIAN  GREEN  APP: SCHOOL 

BRIAN  GREEN  APP: GRADE 

BRIAN  GREEN  APP: COUNSELOR 

BRIAN  GREEN  APP: SUBJECTS 
BRIAN  GREEN  APP: FAILED 
BRIAN  GREEN  APP: MISBEHAVE 
TABLE  CONTENTS: MOVIE  REV 
TABLE  CONTENTS: SCIENCE 
TABLE  CONTENTS: ARTICLE 
TV  GUIDE: RERUN 
TV  GUIDE: BOTH  MOVIE  &  ZOO
TV  GUIDE: NO  NEW  PROG  AT  3  ON  4 
TV  GUIDE :TIME  OF  CARTOONS 
TV  GUIDE: LENGTH  OF  PROG  ON  6
INTER  VOYAGE:  REPT  TRAVEL 




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills)


Type    No.      ECS  ID  ETS  ID  DESCRIPTION

SS  44  7099018-A002/002  INTER VOYAGE: OBERTH DIE
SS  45  7099018-A003/003  INTER VOYAGE: TRANSLATED
SS  46  7099019-A001/001  AUTO: INS MAX AMT MED BILL
SS  47  7099019-A002/002  AUTO: INS MAX AMT INJURED
R  48  7099020-A001/001  TV GUIDE PART: MYSTERY
R  49  7099020-A002/002  TV GUIDE PART: AFTERNOON
R  50  7099020-A003/003  TV GUIDE PART: BOB JOHNSON
R  51  7099021-A001/001  BEAT PARA:PT TO DEFINE
R  52  7099021-A002/002  BEAT PARA: IN COLL ESSAYS
R  53  7099021-A003/003  BEAT PARA: 'FINE, NEGLECTED'
R  54  7099021-A004/004  BEAT PARA: ORIGINS OBSCURE
R  55  7099022-A001/001  SUBURBANITES: ABYSS DEBTS
R  56  7099022-A002/002  SUBURBANITES: SECURE DEBTS
R  57  7099023-A001/001  NAYON:GEOG FACTORS
R  58  7099023-A002/002  NAYON:BY 1948 DEPENDENT
R  59  7099023-A003/003  NAYON:WHY SEPARATED
R  60  7099024-A001/001  N013701  OLD MAN: STORY TELLS HOW MAN LOOKS
R  61  7099025-A001/001  BOOK NOT ALL ABOUT PEOPLE
R  62  7099026-A001/001  INC TAX FORM: SINGLE IF DIVORCED 1-
R  63  7099026-A002/002  INC TAX FORM: NOT FILE JNT
R  64  7099026-A003/003  INC TAX FORM: JNT 1967
R  65  7099027-A001/001  CORP KINDNESS PERSONAL
R  66  7101007-A001/001  COMPOUND WORD CLASSROOM
R  67  7101009-A001/001  MICROSCOPE USED FOR
R  68  7101017-A001/001  PHEASANT MEANS GAME BIRD
R  69  7101019-A001/002  WORDS: MEAN ABATE
R  70  7101019-A003/004  WORDS: MEAN VEHEMENTLY
R  71  7101019-A005/006  WORDS: MEAN INCORRIGIBLE
R  72  7101019-A007/008  WORDS: MEAN MOROSELY
R  73  7101019-A009/010  WORDS: MEAN PROFICIENT
R  74  7101019-A011/012  WORDS: MEAN FURTIVE
R  75  7101019-A013/014  WORDS: MEAN INNUMBERABLE
R  76  7101055-A001/001  N004101  NONSENSE WORD 1:KAG-FIRE
R  77  7101056-A001/001  N014001  NONSENSE WORD 2: TUP-PAPER
R  78  7101057-A001/001  NONSENSE WORD:TRATS SHOES
R  79  7101058-A001/001  NONSENSE WORD:CAGS HANDS
R  80  7101059-A001/001  N009101  NONSENSE WORD 3:HABBIES-DOGS
R  81  7101060-A001/001  NONSENSE WORD:ZUP WATER
R  82  7101061-A001/001  NONSENSE WORD: MARTS
R  83  7101062-A001/001  LUNCH DOOR CAFETERIA
R  84  7101063-A001/001  PRINCIPALS DOOR PICTURE
R  85  7101064-A001/001  SIGN BUS STOP




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type   No.      ECS  ID 


ETS  ID 


DESCRIPTION 


R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 


86 
87 
88 
89 
90 
91 
92 
93 
94 
95 
96 
97 
98 
99 
100 
101 
102
103
104
105
106 
107 
108 
109 
110 
111 
112 
113 
114 
115 
116 
117 
118 
119 
120 
121 
122 
123 
124 
125 
126 
127 


7102001 
7102004 
7102005 
7102006 
7102007 
7102008 
7102010 
7102011 
7102013 
7102014 
7102015 
7102029 
7102030 
7102031 
7102032 
7102032- 
7102032 
7102032- 
7102032- 
7102033- 
7102034- 
7102035- 
7102036- 
7102037- 
7102037- 
7102038- 
7103002- 
7103004- 
7103012- 
7103017- 
7103019- 
7103020- 
7103021- 
7103025- 
7103026- 
7103027- 
7103028- 
7103029- 
7103030-
7103031
7103032-
7103033-


A001/001
A001/001
-A001/001
-A001/001
A001/001
A001/001
A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A002/002
-A001/001
-A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001
A001/001


N009401 
N014101 


N010301
N005101 
N010701 
N015201 


N008701 


N002702 


N003901 
N003401 
N001101


DUAL: WORD  BAT-2  MEANINGS  FOOLED  NELL 
SENTENCE  1:M0ST  SENSE-BLEW  HOUSE  DOWN 
SENTENCE  2: MOST  SENSE-GIRL  WALKED  TO  THE 
MOST  SENSE  BOY  WANTED 
DOG  ON  LEASH  HAS  SPOTS 

SNOWMAN: BEST  DESCRIPTION-SOMEONE  MADE  SNO 
DRAWING: WINNIE  SHORTER  THAN  PAMELA-BEST  S 
SENTENCE  3: MOST  SENSE-BALL  ROLLED  DOWN  TH 
PEOPLE  LEARN  TO  READ:  IN  SCHOOL 
IM  GOING  TO  THAT  MOVIE 
SENTENCE  THAT  ASKS  QUES 

PICTURE: DOG  LYING  ON  TOP  DOGHOUSE-BEST  DE 

SIGN  FOR  BICYCLISTS 

QUIET  SIGN  HANGING  DOOR 

T/F: CHILDREN  ARE  HORSES 

T/F: CRAYONS  OF  BRICKS

T/F: PENCILS  FOR  WRITING 

T/F: SUN  MAKES  YOU  COLD 

T/F: TOUCH  EAR  WITH  TONGUE 

SIGNS  WALKING  PERMITTED 

GHOST  STORY: MOON  IS  FLASHLT 

ATMOSPHERE: SCIENTISTS  KNOW  MOST  ABOUT  TRO 

AUTO  WRECK: WINGS  TURNS 

SHAKESPEARE: DEAF  HEAVEN 

SHAKESPEARE: LOVE  SAVES 

AUTO  WRECK: TERRIBLE  CARGO 

WILLY  WORM  1: STORY  ABOUT  A  HUNGRY  WORM 

EASTER  EGGS  IN  PAST  TITLE 

GOTROCKS:WENT  MT  EVEREST

SPORTS  CAR  TURNS  CORNERS 

BIRDS: CRY  LIKE  MOUSE  WHEN  ANGRY 

SELFISH  PERSON: DESCRIPTION  IN  PASSAGE 

YOUNG  GARDENERS: IN  CENTRAL  PARK-BEST 

PICTURE: CEREAL  WITH  TOY  INSIDE  IS  PAX 

WHICH  BUBBLE  GUM  SWEET 

ZOO  SIGN  DANGEROUS  ANIMAL 

SIGN  FOR  PEDESTRIANS 

CAT  POEM: BUTTONS  SCATTER 

ENG  MUFFINS: BAKING  TIME 

SCARLET  FEVER: HOW  FEEL 

MT  EVEREST: 2  HEIGHTS 

COLORADO  MOUNTAINS  PASS 




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills)


Type    No.      ECS  ID  ETS  ID  DESCRIPTION 


R  128  7103034-A001/001  WIND BOAT STORMY DAY SEA
R  129  7103035-A001/001  H KELLER: ACCOMPLISHMENTS
R  130  7103036-A001/001  GREG GOTROCKS: SENT TO NEPAL TO CHECK ON 0
R  131  7103037-A001/001  PERSON LIKES SPY STORIES
R  132  7103038-A001/001  INCOME TAX:MAX FOR SEPARATE IS $500. EACH
R  133  7103039-A001/001  MENTAL RETARD: AD PURPOSE-ENCOURAGE HIRING
R  134  7103041-A001/001  KOU COUPON: APPEALS TO EVERYONE
R  135  7103041-A002/002  N001301  KOU COUPON: GOOD FOR ANY SIZE CARBON
R  136  7103041-A003/003  N001302  KOU COUPON:USE ON NOV. 10, 1970
R  137  7103041-A004/004  N001303  KOU COUPON: PAYMENT IS 12 CENTS
R  138  7103042-A001/001  9 OUT OF 10 AMERS DEBT
R  139  7103042-A002/002  INCOME INCREASED 50%
R  140  7103043-A001/001  ENG MUFFINS: BAKED GRIDDLE
R  141  7103044-A001/001  N001701  BOOK CLUB: SHIPPING COSTS HIGHER IN CANADA
R  142  7103044-A002/002  N001702  BOOK CLUB: SEND NO MONEY TILL BILLED
R  143  7103044-A003/003  N001703  BOOK CLUB:BUY 6 MORE
R  144  7103045-A001/001  N005201  TRAFFIC: APPEAR IN COURT TO PLEAD NOT GUIL
R  145  7103045-A002/002  N005202  TRAFFIC:FINE-$3.00
R  146  7103045-A003/003  N005203  TRAFFIC: PAY FINE BY THURS, JUNE 11
R  147  7103046-A001/001  FILM NOTICE: DAMAGE REPL
R  148  7103046-A002/002  FILM NOTICE: COLOR CHANGES
R  149  7103047-A001/001  H KELLER: WHEN LOST SIGHT
R  150  7103048-A001/001  SILKY 1:PLAYED INSTRUMENTS
R  151  7103049-A001/001  FARMER BROWN: FARMERS KNOW
R  152  7103049-A002/002  FARMER BROWN: WRITER'S IDEA
R  153  7103050-A001/001  MARTIAN POLAR CAPS: DISCOVERED MORE THAN 2
R  154  7103051-A001/001  BUG SPRAY:NOT KILL FLIES
R  155  7103051-A002/002  BUG SPRAY:HOLD 10 INCHES
R  156  7103052-A001/001  AUTO WRECK: PEOPLE DEAD
R  157  7103053-A001/001  SILKY: DIDN'T LIKE RAIN
R  158  7103054-A001/001  WIND BOAT WEATHER WAS WET
R  159  7103055-A001/001  SKIING:NO ACCOMMODATIONS
R  160  7103056-A001/001  WIND BOAT # OF PEOPLE 2
R  161  7103057-A001/001  CAT/BIRD COMIC: POINT
R  162  7103058-A001/001  H KELLER: EXTENT LECT TOURS
R  163  7103059-A001/001  FRANGIBLES: COMMUNICATE
R  164  7103060-A001/001  SCARLET FEVER:OTHER INFEC
R  165  7103061-A001/001  HOW SPORTS CARS DIFFER
R  166  7103062-A001/001  POISON IVY: WASH TO AVOID
R  167  7103062-A002/002  CALAMINE LOTION SOOTHES
R  168  7103062-A003/003  BORIC ACID FOR EYELIDS
R  169  7127001-A001/001  N003801  SCOTT:BEST TITLE-SCOTT'S PLAN




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills)


Type    No.      ECS  ID  ETS  ID  DESCRIPTION

R  170  7127001-A002/002  N003802  SCOTT: 6 WEEKS BETWEEN DEPOTS
R  171  7127001-A003/003  N003803  SCOTT: CACHE-PLACE FOR STORING THINGS
R  172  7127002-A001/001  SLEEKY: HOW MANY OTTERS
R  173  7127002-A002/002  SLEEKY: WORD CHATTER MEAN
R  174  7127003-A001/001  N002101  VIRUSES: DIFFICULT TO STUDY
R  175  7127003-A002/002  N002102  VIRUSES: CLOTHE IDEA-GIVE PROOF TO SUPPORT
R  176  7127004-A001/001  SUBURBANITES: SELF ENTRAP
R  177  7127004-A002/002  SUBURBANITES: BUDGETISM
R  178  7127005-A001/001  FARMER BROWN:MAIN IDEA
R  179  7127005-A002/002  FARMER BROWN: CHANGE ENVIR
R  180  7127006-A001/001  PERSIAN GULF OYSTERS
R  181  7127007-A001/001  SOCIAL SCI: WRONG TO NEGLECT BEHAV & CULT
R  182  7127009-A001/002  TAA HOSTESS: COMPANY
R  183  7127009-A003/004  TAA HOSTESS: JOB
R  184  7127009-A005/005  TAA HOSTESS: QUALIFICATION 1
R  185  7127009-A006/006  TAA HOSTESS: QUALIFICATION 2
R  186  7127009-A007/007  TAA HOSTESS: TOP SALARY
R  187  7127009-A008/008  TAA HOSTESS: HOW APPLY
R  188  7127009-A009/009  TAA HOSTESS:EOE
R  189  7201002-A001/001  AMOS ANT: WENT TO PARK FIRST
R  190  7201003-A001/001  WIND BOAT PUSH FIRST WENT SEA
R  191  7201013-A002/003  SEQUENCE CARTOONS
R  192  7201014-A001/001  SEA FEVER: POET ASKS FOR
R  193  7201023-A001/001  H KELLER: WHEN STUDY PROBS
R  194  7201024-A001/001  ENG MUFFINS: 4 INGREDIENTS
R  195  7201025-A001/001  SCARLET FEVER: OTHER DISEA
R  196  7202003-A001/001  N002701  ATMOSPHERE: 4 WORDS CUE-FIRST,NEXT,ABOVE,F
R  197  7202008-A001/001  EVENTS: BEFORE MEETING WENT TO CONF ROOM
R  198  7203002-A001/001  GHOST STORY: MOOD FRIGHT
R  199  7203003-A001/001  GHOST STORY: ADD MYSTERY
R  200  7203006-A001/001  H KELLER: IN CHRONOLOGICAL
R  201  7203009-A001/001  ANGRY: TONE BEST DESCRIBED AS ANGRY
R  202  7203009AA001/001  CANNOT TOLERATE ANGRY
R  203  7203010-A001/001  TURTLE POEM:UNUSU PT OF VIEW
R  204  7203011-A001/001  FLIES EXAGGERATING SIZE
R  205  7203012-A001/001  SENTENCE TONE SATIRICAL
R  206  7203013-A001/001  TURTLE POEM: QUICK GUMPS
R  207  7203043-A001/001  GHOST STORY: WIND SOUNDS
R  208  7203044-A001/001  FLIES: AUTHOR WANTS YOU TO THINK IT'S FUNN
R  209  7203045-A001/001  FISH WALK TO MAKE LAUGH
R  210  7203046-A001/001  SKIING: LOVE OF
SS  211  7203047-A001/001  N011701  WHICH WORD COMES FIRST IN DICTIONARY- FLE




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID  ETS  ID  DESCRIPTION

R  212  7203048-A001/001  TURTLE POEM: CONTENTED
R  213  7203049-A001/001  STATIC CULTURE EARNEST
R  214  7203050-A001/001  FISH WALK FUNNY STORY
R  215  7203051-A001/001  WIND WHISTLED SOUND PAIR
R  216  7203052-A001/001  CAT/BIRD COMIC: TONE
R  217  7203053-A001/001  SPEAKER ATTITUDE EXASP
R  218  7203054-A001/001  GARLIC SENTENCES 5 AND 6
R  219  7227001-A001/001  FRANGIBLES:MAIN PURPOSE
R  220  7301002-A001/001  TIMOTHY: TIME OF YEAR
R  221  7301004-A001/001  N009601  TIMOTHY 1: SITTING ON STEPS
R  222  7301006-A001/001  TIMOTHY: MEN WASHING CARS
R  223  7301007-A001/001  TIMOTHY: GIRLS JUMP ROPE
R  224  7301011-A001/001  N014201  TIMOTHY 2: TEENAGERS STANDING IN CIRCLES
R  225  7301012-A001/001  N012901  TIMOTHY 3: TEENAGERS TALKING ABOUT HEAT
R  226  7301014-A001/001  TIMOTHY: WORKMEN TEARING
R  227  7301019-A001/002  J DOUGLAS: 3 WOMEN IN ROOM
R  228  7301019-A003/004  J DOUGLAS: RUNNING AWAY
R  229  7301020-A001/002  LONE DOG: 5 WORDS(1)
R  230  7301020-A003/004  LONE DOG: 5 WORDS(2)
R  231  7301020-A005/006  LONE DOG: 5 WORDS(3)
R  232  7301020-A007/008  LONE DOG: 5 WORDS(4)
R  233  7301020-A009/010  LONE DOG: 5 WORDS(5)
R  234  7301020-A011/012  LONE DOG: 2 THINGS DOES(1)
R  235  7301020-A013/014  LONE DOG: 2 THINGS DOES(2)
R  236  7301022-A001/002  ZEKE: PLACES LIVED 2
R  237  7301022-A003/004  ZEKE: LIVES NOW HARLEM
R  238  7301022-A005/006  ZEKE: HOUSE BROWNSTONE
R  239  7301022-A009/010  ZEKE: TOPMOST FLOOR
R  240  7301027-A001/002  J DOUGLAS: CITY BROOKLYN
R  241  7301027-A003/004  J DOUGLAS: MONTH NOVEMBER
R  242  7301027-A005/006  J DOUGLAS: DAY MONDAY
R  243  7301071-A001/001  N014501  CONNECT DOTS: ALONG LINE, CONNECT DOTS
R  244  7301071-A002/002  N014502  CONNECT DOTS: DRAW LINE TO TOUCH CIRCLES
R  245  7301071-A003/003  N014503  CONNECT DOTS: WRITE 3 IN EACH CIRCLE
R  246  7302001-A001/001  OVAL: FILL IN OVAL BELOW LETTER E
R  247  7302002-A001/002  CONNECT DOTS SOLID LINE A
R  248  7302002-A003/004  WRITE WORD CAT ON LINE
R  249  7302002-A005/006  LINE TO CONNECT 2 AND 7
R  250  7302002-A007/008  CONNECT DOTS SOLID LINE D
R  251  7302004-A001/001  EVER VISITED MOON
R  252  7302004-A002/002  NEVER VISITED MOON
R  253  7302005-A001/001  FIGURE MADE WITH 3 LINES




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis
(R=Reading,  SS=Study  Skills)


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 


254 
255 
256 
257 
258 
259 
260 
261 
262 
263 
264 


SS  265 

SS  266 

SS  267 

R  268 

SS  269 

SS  270 

SS  271 

SS  272 

SS  273 

SS  274 

SS  275 

SS  276 

SS  277 

SS  278 

R  279 

SS  280 

R  281 

R  282 

SS  283 

SS  284 

SS  285 

SS  286 

SS  287 

R  288 

R  289 

SS  290 

R  291 

R  292 

R  293 

R  294 

SS  295 


7302008 
7302009 
7302009 
7302009 
7302009 
7302009 
7302012 
7302012- 
7302012- 
7302012- 
7302012- 
7302213- 
7303002- 
7303003- 
7303004-
7303005- 
7303006- 
7303007- 
7303008- 
7303010- 
7303010- 
7303010- 
7303010- 
7303010- 
7303012 
7303013
7303014 
7303017 
7303017 
7303018 
7303018 
7303018 
7303018- 
7303018-
7303019- 
7303019- 
7303023- 
7303026- 
7303026- 
7303026- 
7303026- 
7303027- 


-A001/002
A001/002
A003/004
-A005/006
-A007/008
-A009/010
-A001/002
-A003/004
-A005/006
-A007/008
-A009/010
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A002/002
-A003/003
-A004/004
A005/005
-A001/001
-A001/001
-A001/001
-A001/001
-A002/002
-A001/002
-A003/004
-A005/006
-A007/008
-A009/009
-A001/001
-A002/002
-A001/001
-A001/001
A002/002
A003/004
-A005/005
A001/001


N012101 
N006501 
N012601 


N011801 


N012001 
N012301 

N004501 
N004502
N006901 
N006902 
N006903 


N001201 
N001202 


N006101 


WRITE  ZERO  NOT  YOUR  AGE 
DRAW  HORIZONTAL  LINE 
DRAW  2  CIRCLES 
DRAW  ANOTHER  CIRCLE  ABOVE 
CONNECT  CENTERS  OF  3 
DRAW  VERTICAL  LINE 
SHAPES: 3  IN  LARGE  CIRCLE 
SHAPES: 2  IN  SMALL  SQUARE 
SHAPES: 7  IN  LARGE  TRIAN 
SHAPES: 4  IN  SMALL  CIRCLE 
SHAPES: 5  IN  LARGE  SQUARE 
CODE: WHAT  DOES  HPPE  ACTUALLY  SPELL-GOOD 
FIND  GUIDE: OPTIONAL  BETWEEN  OPPRESS-ORACL
FIND: BEST  PLACE  FIND  ROTOR-DICTIONARY 
DOG  FOOD  LABELS  PROTEIN 
ENCYCLOPEDIAS  1:EGGS  IN  VOL  3 
ESKIMOS  LOOK  IN  INDEX 
ENCYCLOPEDIAS  2: WASHINGTON  IN  VOL  11 
ENCYCLOPEDIA: WINDMILLS
MAP:NORTHTOWN  CLOSER  TO  HOPE 
MAP: CAN  DRIVE  TO  FALLS  CITY 
MAP: HOPE  CLOSEST  TO  CENTERVL 
MAP:CENTERVILLE  FARTHER  WEST 
MAP:HWY  20  S  OF  RIVER 

FIND: DECLARATION  OF  INDEPENDENCE  IN  ENCYC 

PIC: 3  PARTS  MUSHROOM-CAP, STEM, GILLS 

JONES  PHONE  NUMBER 

AREA  CODE:INFO  NY-1-212-555-1212 

AREA  CODE:SYRACUSE-1-315-255-6011

NEWS: TV  SCHEDULE-PG  22 

NEWS: WEATHER  FORECAST- PG  12 

NEWS: STOCK  AVERAGES-PGS  29-31 

NEWS: QUESTION  **NOT  -SCORED** 

NEWS: BRIDGE  INFO  GIVEN 

LONG  DIST: RATE  ON  CALL-LOWER  EVENING  RATE

LONG  DIST: PERSON  CALLS  DIFF-OPR  ASSISTED 

FATAL  ACCIDENTS  2  TO  3AM 

HELP  WANTED: AD  HOURS  AM 

HELP  WANTED: AD  HOURS  PM 

HELP  WANTED: AD  AGE  REQ 

HELP  WANTED: AD  SALARY 

WIND  SYMBOLS: FOR  35  KNOTS-SYMBOL  3 




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


SS  296 
SS  297 
SS  298 
SS  299 
SS  300 
SS  301 
SS  302 
SS  303 
SS  304 
SS  305 
SS  306 
SS  307 
SS  308 
SS  309 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 


310 
311 
312 
313 
314 
315 
316 
317 
318 
319 
320 
321 
322 
323 
324 
325 
SS  326 
SS  327 
SS  328 
SS  329 
SS  330 
SS  331 
SS  332 
SS  333 
SS  334 
SS  335 
R  336 
R  337 


7303030- 
7303031- 
7303033- 
7303034- 
7303034- 
7303035- 
7303035- 
7303037- 
7303037- 
7303042- 
7303043- 
7303044- 
7303045- 
7303050- 
7303051 
7303051 
7303051 
7303052 
7303054 
7303054 
7303055 
7303055 
7303056 
7303056 
7303056 
7303056 
7303057 
7303057 
7303057 
7303057 
7303058 
7303059 
7327001 
7327001 
7327001 
7327001 
7327001 
7327001 
7327001 
7327001 
7401001 
7401003 


A001/001
A001/001
A001/001
A001/002
A003/003
A001/001
A002/002
A001/001
A002/002
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A001/001
-A002/002
-A003/003
-A001/001
-A001/001
-A002/002
-A002/002
-A003/003
-A001/002
-A003/004
-A005/006
-A007/008
-A001/002
-A003/004
-A005/006
-A007/008
-A001/001
-A001/001
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A006/006
-A007/007
A008/009
-A001/001
-A001/001


N005901 
N005902 
N006801 
N006802 
N006301 
N006302 
N006701 


N005801 
N002201 
N002202 
N002203 


REPORT  CARD: PERIOD 

REPORT  CARD: IMPROVE  SCI 

REPORT  CARD: ALGEBRA  PROB 

CARDCAT:CALL  NUMBER- WRITE-IN  ANSWER 

CARDCAT: PICTURES  INDIC  BY  "ILLUS" 

MAP: SPANISH  IN  SOUTH 

MAP:GRP  IN  ALASKA-NOT  ENOUGH  INFO

CLOTHES  SIZES: SHOE  SIZE  8-40-1 

CLOTHES  SIZES: 38  SWEATER-44 

SCI  INDEX: WOLVES  FIRST  IN  BOOK 

REPORT  CARD: FOREIGN  LANG 

ACC  CHART: INCONCLUSIVE 

SCI  INDEX: EARTHWORMS  INFO 

ENGLISH  DTC:BOOK  TELLS  WORD  MEANINGS-DICT 

PHONE  BILL: FEB  14  CALL  FROM  ATHENS,  GA 

PHONE  BILL: FEB  14  CALL  TO  ST  PAUL,  MN 

PHONE  BILL: FEB  14  CALL  COST  $.75 

CLOCK  BIG  HAND  BET  12  &  1 

FISHING: METHOD  NOT  PERMITTED-USE  MORE  THA

FISHING: FOR  MULLET-CAN  USE  ALL  METHODS 

NUCLEAR  BURSTS: IMM  DANGER 

NUCLEAR  BURSTS: SKIN  BURNS 

WIN-EM-ALL: DEALER  CHOSEN 

WIN-EM-ALL: ADULTS  &  CHILD 

WIN-EM-ALL: NO  MORE  CARDS 

WIN-EM-ALL: 1ST  PLAYER 

WIN-EM-ALL: DEALS  FIRST 

WIN-EM-ALL: HOW  MANY  PLAY 

WIN-EM-ALL:TIE  TRICK 

WIN-EM-ALL: WINNER 

ACCIDENT  STATE  BEY  FACTS 

SCHEDULE  OTHER  CHOICES  GO 

ST  PAUL: CALL  LAKEVILLE  CHARGE 

ST  PAUL: CALL  MAPLE  PLAIN  CHARGE 

ST  PAUL: CALL  MINNEAPOLIS  NO  CHRG 

ST  PAUL: CALL  NORTH  AREA  NO  CHRG 

ST  PAUL: CALL  SHAKOPEE  CHARGE 

ST  PAUL: CALL  SOUTHWEST  AREA  NO  CHRG 

ST  PAUL: CALL  WHITE  BEAR  LAKE  NO  CHRG 

ST  PAUL:CALL  WHAT  AREA  533-0221-N0THWST(H 

SILKY  SPIDER  2: SILKY  WAS  HUGE-BEST  DESCRI

TURTLE  POEM: SPEAKING 




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 


338 
339 
340 
341 
342 
343 
344 
345 
346 
347 
348 
349 
350 
351 
352 
353 
354 
355 
356 
357 
358 
359 
360 
361 
362 
363
364 
365 
366 
367 
368 
369 
370 
371 
372 
373 
374 
375 
376 
377 
378 
379 


7401005 
7401007 
7401010 
7401011 
7401016 
7401019 
7401022 
7401024- 
7401030- 
7401032- 
7401066- 
7401067- 
7401067- 
7401067- 
7401068- 
7401069- 
7401070- 
7401071- 
7401072- 
7401073- 
7401074- 
7401075- 
7401076- 
7401077- 
7401078- 
7401079- 
7401080 
7401081 
7401082 
7401083 
7401084 
7401085 
7401086 
7402020- 
7402021- 
7402022- 
7402023- 
7403007- 
7403018- 
7403019- 
7502012- 
7503001- 


-AOOl/001 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
■AOOl/OOl 
■AOOl/OOl 
■AOOl/OOl 
■AOOl/OOl 
•AOOl/OOl 
•A002/002 
AO03/003 
AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
-AOOl/OOl 
■AOOl/OOl 
■AOOl/OOl 
■AOOl/OOl 
■AOOl/OOl 
■AOOl/OOl 
•AOOl/OOl 
AOOl/OOl 
AOOl/OOl 
AOOl/OOl 


N003001 
N003002 
N003003 
N008801 


N010201 
N013301 
N009901 
N001401 


SILKYS  WEB  VERY  BIG 
H  KELLER: SULLIVAN  METHOD 
SILKY: LIKES  BEAN  SOUP 
WILLY  2: ATE  APPLE 
N004801  SILKY  3: WISHED  MORE  HAIR 
SILKY: LIKED  CUDDLES 
SILKY: FLIES  AS  PLAYMATES 
BEST  TITLE: A  TASTE  FOR  READING 
FRANGIBLES: SEEMS  FALSE 
HORSEPOWER  WITHOUT  SENSE 
FISH  PICTURE  ABOUT  TO  EAT 
SUPR  COURT: CONSTITUTION  DESCRIPTION-BRIEF 
SUPR  COURT: DIFFICULT  RESPON  FOR  COURT  MEM 
SUPR  COURT: "THEIR"  REFERS  TO  PROVISIONS 
YVONNE'S  DOLL: COULDN'T  FIND-UNDER  PORCH
FINISH  2ND  STORY  LIKE  1ST 
CAT/BIRD  COMIC: BIRD  WOULD  SAY 
DESCRIPTION  1: CLOWN  DESCRIBED  IN  PASSAGE 
DESCRIPTION  2: UNHAPPY  PERSON  DESCRIBED  IN 
DESCRIPTION  3: PERSON  HAS  SEEN  TOY  MANY  TI 
VERSE: DECK  OF  CARDS  DESCRIBED  IN  POEM 
VERSE: CLOCK 
VERSE: FLAG 
VERSE: EYEGLASSES 
TOMMY  AND  SAMMY  FIGHT 
FRANGIBLES: ENTER  OBJECT 
STATIC  CULTURE  ATTITUDES 
WIND  BOAT  HELP  NOW  RESCUE 
CHRISTMAS  NEAR  COATS 
POEM: UNSURE  ATTITUDE 
N011201  DOGS'  QUAL:BITTEN  BY  DOG,  DISAGREE 
CHRISTMAS  SHOPPING  LAST 
CHRISTMAS  STORY  DEC  21 
N002401  MOSQUITO: SIZE  MOSQUITOES  EXAGGERATED 
N009201  PUZZLE  1:BIRD  DESCRIBED  IN  PUZZLE 
PUZZLE  2: WORM  DESCRIBED  IN  PUZZLE 
N009801  PUZZLE  3: CHAIR  DESCRIBED  IN  PUZZLE 
N004901  COLORADO: GOLD  DISCOVERY  DOESN'T  BELONG 

BERT  &  ART  NOT  BOTH  RIGHT 
N002501  MARY: WILL  GET  MONEY  FROM  NEITHER 

AMOS  ANT: MAKE-BELIEVE 
N011101  KIND  OF  BK: ATMOSPHERE  FROM  SCIENCE  BOOK




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


R 
R 


380 
381 


SS  382 

SS  383 

SS  384 

R  385 


R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 


386 

387 

388 

389 

390 

391 

392 

393 

394 

395 

396 

397

398 

399 

400 

401 

402 

403 

404 

405 

406 

407 

408 

409 

410 

411 

412 

413 

414 

415 

416 

417 

418 

419 

420 

421 


7503004 
7503005 
7503009- 
7503009- 
7503009- 
7503044-
7503045- 
H201000- 
H201000- 
H201000- 
H202000- 
H202000- 
H202000- 
H202000- 
H204000- 
H204000- 
H204000- 
H204000- 
H205000- 
H205000- 
H205000- 
H205000- 
H206000- 
H206000- 
H222000- 
H222000- 
H222000- 
H222000- 
H222000 
H224000 
H224000 
H224000 
H225000 
H225000 
H225000
H225000 
H225000 
H241000
H241000
H241000
H243000
H243000 


A001/001
A001/001
AOOl/002 
A00?/004 
A005/005 
A001/001
A001/001
A001/001
A002/002
A003/003
A001/001
A002/002
A003/003
A004/004
A002/002
A003/003
A004/004
-A005/005
A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A001/001
-A002/002
A003/003
-A002/002
-A003/003


N003301 
N008601 
N008602 
N008603 


N010902 
N010903 
N010904 

N010501 
N010502 
N010503 
N010504 
N011301 
N011302 
N001601 
N001602 

N001603 
N001604 


N004401 
N004402 
N004403 


PASSAGE  WITH  AGE  CONFLICT 
SKIING: PERSONAL  POINT 
BANK  CHECK: WHO  RECEIVES 
BANK  CHECK: NUMBER 
BANK  CHECK: CANT  BE  CASHED 
ATMOSPHERE: OPINION 
BOBBY: SAYS  TALL  IS  SMART 
CRICKETS:  MAKE  SOUNDS  BY  RUBBING  WINGS 
CRICKETS:  WHICH  MAKE  CHIRPING  SOUNDS-ONLY 
CRICKETS:  WHERE  ARE  EARS  -  IN  FRONT  LEGS 
EXTINCT: PAY  A  PRICE  MEANS  GIVE  UP  IN  RETU 
EXTINCT: MANY  MARSUPIALS  IN  AUSTRALIA-NO  C 
EXTINCT: WELL  ADAPTED  SPECIES  MAYBE  OVERSP 
EXTINCT: MARSUPIALS  &  PLACENTALS-MANY  DIFF 
STARS  UNSEEN: STAR  BECOMES  DEAD  BY  USING  U 
STARS  UNSEEN: MAIN  IDEA-STARS  EXIST-WE  CAN 
STARS  UNSEEN: DEAD  STARS  BIG  &  HEAVY-PUSB 
STARS  UNSEEN: RADIO  STAR-AREA  FILLED  ELEC
QUICKSAND: HOW  TEST  FOR  IT-POKE  WITH  A  STI 
QUICKSAND: MAIN  PURPOSE-TO  TELL  WAYS  AVOID 
QUICKSAND: IT  IS  SOUPY  SAND  YOU  CAN'T  STAN 
QUICKSAND: IF  STEP  IN, LIE  ON  BACK  &  STRETC
SKUNK  CABBAGE: NAME-SMELLS  LIKE  SKUNK, LOOK 
SKUNK  CABBAGE: HARD  TO  SEE-HIDDEN  UNDER  HO 
1ST  AM: BITTER  WINTER-EXTREMELY  COLD 
1ST  AM: ICE  AGE  PEOPLE  DEPENDED  ON  ANIMALS 
1ST  AM: KIND  OF  PEOPLE-WANDERERS  NEEDING  A 
1ST  AM: NO  LAND  BRIDGE  NOW-COVERED  WITH  WA 
1ST  AM: MAIN  PURPOSE-EXPLN  ICE  AGE  SETTLER 
FORD: 1ST  CARS  COSTLY  BECAUSE  TOOK  TIME  TO 
FORD: PROFIT-MONEY  AFTER  EXPENSES  PAID 
FORD: WORK  MADE  EASIER  BY  RAISING  ASSEMBLY 
RUSS  PORTS: WHY  FEW  USABLE-WATER  FROZEN  MO 
RUSS  PORTS: BALTIC  ATTRACTIVE-LINK  INTERIO 
RUSS  PORTS: GREAT  NO. WAR  &  JAPAN  WAR  TO  WI 
RUSS  PORTS: MAIN  PURPOSE-DISCUSS  EFFORTS  G 
RUSS  PORTS: AVENUE  IN  SENTENCE  MEANS  ROUTE 
NAOMI  JAMES: HOW  LONG  ON  SAILING  TRIP-272
NAOMI  JAMES: IMPORTANCE  OF  TRIP-BROKE  WORL 
NAOMI  JAMES: WORST  PART  OF  TRIP-  BAD  STORM 
COUSTEAU: PEOPLE  SEEK  ADVENTURE  TO  FIND  OU 
COUSTEAU: LOWER  ODDS-REDUCE  RISKS 




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 


422 
423 
424 
425 
426 
427 
428 
429 
430 
431 
432 
433 
434 
435 
436 
437 
438 
439 
440 
441 
442 
443 
444 
445 
446 
447 
448 
449 
450 
451 
452 
453 
454 
455 
456 
457 
458 
459 
460 
461 
462 
463 


H243000- 
H243000- 
H262000- 
H262000- 
H262000- 
H262000- 
H263000- 
H263000- 
H263000- 
H263000- 
H263000- 
H265000- 
H265000- 
H265000- 
H266000- 
H266000- 
H268000- 
H269000- 
H269000- 
H282000- 
H282000- 
H282000- 
H284000- 
H284000- 
H284000- 
H284000- 
H286000- 
H286000- 
H286000- 
H287000- 
H287000- 
H301000- 
H301000- 
H301000- 
H302000-
H302000-
H302000- 
H302000- 
H304000- 
H304000- 
H304000- 
H403000- 


-A004/004 
-A005/005 
-A002/002
-A003/003
-A004/004
-A005/005
-A002/002
-A003/003
-A004/004
-A005/005
-A006/006
-A001/001
-A002/002
-A003/003
-A001/001
-A002/002
-A001/001
A002/002
A003/003
A001/001
A002/002
A003/003
A001/001
A002/002
A003/003
A004/004
A001/001
A002/002
A003/003
A001/001
A002/002
A001/001
A002/002
A003/003
A001/001
A002/002
A004/004
A005/005
A002/002
A003/003
A004/004
A001/001


N015502 
N015503 
N015504 
N015505 
N002902 
N002903 
N002904 
N002905 
N002906 
N002001 
N002002 
N002003 


N013201 
N010102 
N010103 


N003201 
N003202 
N003203 
N003204 
N004701 
N004702 
N004703 
N013501 
N013502 


N007301 


COUSTEAU:CALM  AFTER  SHARK-WANT  CLEAR  PICT 
COUSTEAU: AQUALUNG  IMPORTANT-ALLOWS  FREER
CHAMONIX:WHY  SO  LONG  TO  REACH-WINDS  TOO  S 
CHAMONIX:DEVOUASSOU-MAN  WHO  FOUND  CLIMBER 
CHAMONIX:DESMAISON  SURVIVE  BY  MENTAL/PHYS
CHAMONIX:WHY  DESMAISON  CRY-OVERCOME  SUFFE
SOCCER: MOST  POPULAR  BECAUSE  PLAYED  BY  MIL 
SOCCER: KING  ED  WANTED  TO  OUTLAW-PRACTICE
SOCCER: CALLED  FOREIGN-IMMIGRANTS  PLAYED  I 
SOCCER: INTRO  TO  ENGLISH  BY  ROMANS 
SOCCER: PELE  MASTER-FOOLED  OPPONENTS  BY  FA 
WISH  COULD  FLY: GOSSAMER  CONDOR  1ST  MUSCLE 
WISH  COULD  FLY: BIKE  RACER,  BRYAN  ALLEN  FL
WISH  FLY:MACCREADY  PLANE  DIFF-SIMPLER, LIG 
NO  NICE  BEAR: IN  PAST  SMOKEY  OFFERED  POLIT 
NO  NICE  BEAR:CHNGD  IMAGE  BECAUSE  MORE  FOR 
BULLFIGHT: BULL  CHARGES  CAPE  MOTION 
SANDWICH: NAMED  AFTER  PERSON  WHO  INVENTED 
SANDWICH: WANTED  MEAT  IN  BREAD  TO  EAT  AND 
LABELS: ASPIRIN/5-YR-OLD, TAKE  1/2  TABLET
LABELS: EXTERNAL  USE-DO  NOT  DRINK
LABELS: ANTIDOTE/ANTIDOTE-A  TREATMENT
SUMMER  JOB:SOC  SECURITY  APPLIC  AT  BANK  OR 
SUMMER  JOB: BEST  TIME  TO  FIND  JOB-BEFORE  M 
SUMMER  JOB: NEED  SS  CARD  TO  GET  INTERVIEW 
SUMMER  JOB: REFERENCES- PEOPLE  WHO  KNOW  APP 
CARRIER  AD: IF  INTEREST  &  MEET  REQRMNTS-CA 
CARRIER  AD: 8  YR  OLDS  TOO  YOUNG  FOR  JOB
CARRIER  AD: MUST  DELIVER  PAPERS  BY  7  EACH
CRIME: HARD  TO  PROVE  OWN  BIKE  IF  REPAINTED 
CRIME: MAIN  PURPOSE-TO  GIVE  SECRET  WAY  TO
OLYMPIC  AD: MAIN  PURPOSE-ENCOURAGE  SUPPORT 
OLYMPIC  AD: SENTENCE  MEANS  CITIZENS  PROVID 
OLYMPIC  AD: OLYMPIC  GOLD-MEDALS  FOR  WINNER 
INTELL: PURPOSE-BRAIN  SIZE  NOT  DETERMINE  I 
INTELL:USES  FACTS  &  FIGURES 
INTELL: REFERS  TO  EXPERTS  OF  SAME  OPINION 
INTELL: PRETENDS  AGREE  WITH  OTHER  POINT  OF 
FITNESS: PHYS  FITNESS  HELPS  CHILD  BE  HEALT
FITNESS: AD  PURPOSE-CONVINCE  PARENTS  KIDS 
FITNESS: PARENTS  GET  INFO  FROM  SCHOOL  OR  R 
BRIDGER:KIND  OF  PEOPLE  WERE  MTN  MEN-FUR  T 




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID  ETS  ID 


DESCRIPTION 


R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 


464 
465 
466 
467 
468 
469 
470 
471 
472 
473 
474 
475 
476 
477 
478 
479 
480 
481 
482 
483 
484 
485 
486 
487 
488 
489 
490 
491 
492 
493 
494 
495 
496 
497 
498 
499 
500 
501 
502 
503 
504 
505 


H403000- 
H403000- 
H403000- 
H403000- 
H403000- 
H404000- 
H404000- 
H404000- 
H404000- 
H405000- 
H405000- 
H405000- 
H405000 
H406000 
H406000 
H406000 
H406000 
H408000 
H408000 
H408000- 
H408000- 
H408000- 
H409000- 
H409000- 
H409000- 
H409000- 
H409000- 
H412000- 
H412000- 
H412000- 
H413000- 
H413000- 
H413000- 
H413000- 
H413000 
H416000 
H416000 
H417000 
H417000 
H417000 
H417000
H418000


A002/002 
A003/003 
A004/004 
A005/005 
-A006/006
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A005/005
-A002/002
-A003/003
-A004/004
-A005/005
-A006/006
-A002/002
-A003/003
-A004/004
-A005/005
A006/006
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A002/002
-A003/003
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001


N007302 
N007303 
N007304
N007305 
N007306 
N013101 
N013102 
N013103 
N013104 
N001501 
N001502 
N001503 
N001504 


N008201 
N008202 
N008203 
N008204 
N008205 
N010002 
N010003 


BRIDGER:BEST  DESCRIBE  STORIES-STRETCHED  T 
BRIDGER: SIMILE-PONDS  OF  MUD  BOILING  LIKE 
BRIDGER:WHO  DISCOVERED  LAND  NOW  YELLOWSTO 
BRIDGER: SHOT  MISSED  ELK  BECAUSE  ELK  OUT  0 
BRIDGER:  HYPERBOLE-  LAKES  THAT  HAD  NO  BOT 
THE  COLD: BOY  LEFT  SHADOW-FROZE  TO  SIDE  OF 
THE  COLD: GIRLS  FIGHT  WITH  MELTED  WORDS 
THE  COLD: DUCKS  FLY  AWAY  WITH  POND 
THE  COLD:WRITER  MAKES  STORY  SOUND  PLAYFUL 
NUTS: DEVIL  PUT  PEARL  IN  WALNUT
NUTS: FARM  WIFE  WAS  CLEVER  AND  PRACTICAL 
NUTS: WANTED  TRICK  SOMEONE  INTO  CRACKING  W 
NUTS: PLAN  WRONG-WOMAN  DIDN'T  CRACK  WALNUT 
GOOD  DOG: DOG'S  DEATH  DESCRIBED-PAINLESS  & 
GOOD  DOG: SIMILE-LOOKED  WITH  EYES  LIKE  BUL 
GOOD  DOG: MAN  WHO  LIVED  WITH  DOG-CARING  S. 
GOOD  DOG: HYPERBOLE-COULD  EAT  100  LOAVES  0 
BOKUDEN: SUGGESTED  HE  &  SWORDSMAN  FIGHT  ON 
BOKUDEN: THEME-CLEVERNESS  OVERCOME  PHYSICL 
BOKUDEN: BRAGGART  MEANS  BOASTFUL 
BOKUDEN: AFTER  REMOVED  JACKET-SHOVED  BOAT
BOKUDEN: MUTEKATSU  HIGHEST  SKILL  BECAUSE  N 
OLD  MAN: GRANDFATHER  EATS  SLOPPY-OLD  &  WEA 
OLD  MAN: GRANDFATHER  FELT  SAD  WHEN  NOT  AT 
OLD  MAN: DISGUSTED  MEANS  ANNOYED 
OLD  MAN: MAN  &  WIFE  CRY  BECAUSE  WAY  TREAT 
OLD  MAN:GRANDSON  MAKE  TROUGH  GIVE  PARENTS 
FUN: MARGIE  LEARNED  ENGLISH  AT  HOME  BY  TV 
FUN: BOOK  WAS  ABOUT  SCHOOL 
FUN: STORY  TAKES  PLACE  IN  FUTURE 
COW-TAIL: OGALOUSSA  WAS  KILLED  WHILE  HUNTI 
COW-TAIL:THEME-PERSON  NOT  DEAD  TILL  FORGO 
COW-TAIL: OGALOUSSA  IS  WISE, FAIR  FATHER 
COW-TAIL: OGALOUSSA  SHAVED  HEAD-RETURNED  F 
COW-TAIL: PULI  GOT  SWITCH-ASKED  ABT  FATHER 
DOG  &  SHADOW: SAW  HIMSELF  IN  THE  STREAM 
DOG  &  SHADOW: TEACHES  LESSON-GREED  DOESN'T
FLOWERS: WILLIE  THOUGHT  OK  STEAL-DEAD  MAN 
FLOWERS: IF  GLORIA  KNEW-WOULDN'T  LIKE  THEM
FLOWERS: AT  END, MAN  FROM  GRAVE  CAME  TO  SEE 
FLOWERS: WILLIE  SAID  BOUGHT-THOUGHT  MOM  TA 
HUMBUG: STRIKING  IT  RICH-FINDING  LOTS  OF  G 




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 

R 


506 

507 

508 

509 

510 

511 

512 

513 

514 

515 

516 

517 

518 

519 

520 

521 

522 

523 

524 

525 

526 

527 

528 

529 

530

531 

532 

533 

534 

535 

536 

537 

538 

539 

540 


SS  541 

SS  542 

SS  543 

SS  544 

SS  545 

SS  546 

SS  547 


H418000
H418000
H419000
H419000
H419000
H419000
H419000
H422000 
H422000 
H422000 
H441000 
H441000 
H441000 
H442000-
H442000- 
H442000- 
H442000- 
H442000- 
H442000- 
H442000- 
H442000- 
H443000- 
H461000- 
H461000- 
H463000- 
H463000- 
H463000- 
H466000- 
H466000- 
H466000- 
H468000- 
H468000- 
H471000- 
H471000- 
H471000- 
H602000- 
H602000- 
H602000- 
H605000 
H605000- 
H605000- 
H605000- 


-A002/002 
-A003/003 
-A001/001
-A002/002
A003/003
-A004/004
-A005/005
-A001/001
-A002/002
-A003/003
-A001/001
-A002/002
-A003/003
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A006/006
-A007/007
-A008/008
-A001/001
A002/002
A003/003
A001/001
A002/002
A003/003
A002/002
A003/003
A004/004
A001/001
A002/002
A001/001
A002/002
A003/003
A001/001
A002/002
A003/003
A001/001
A002/002 
A003/003 
A004/004 


N010601 
N010602 
N010603 
N010604
N010605
N013401 
N013402 
N013403 


N008101 
N008102 
N008103 
N008104 
N008105 
N008106 
N008107 
N008108
N007501 


N013601 
N013602 
N013603


N010801 


N005701 
N005702 
N005703 
N007101 
N007102 
N007103 
N007104 


HUMBUG: GOLD  WAS  SUPPOSEDLY  LYING  ON  THE  G 
HUMBUG: GOT  NAMED  BECAUSE  PEOPLE  WERE  FOOL 
THAD: CANDIDATES  FOR  PRES  NOT  ALLOWED  GIVE 
THAD: MAGGIE  THOUGHT  THAD  GOOD  BUT  NEED  HE 
THAD: MASSIVE  STAMPEDE-LOT  OF  PEOPLE  RUSHI 
THAD: EXAGGERATED-CAN  DO  EVERYTHING  IN  YEL 
THAD: MAGGIE  FIRST  HELPED  THAD  WITH  SPEECH 
FROM  THE  PLANET :BOTCHIK  FELT  ANNOYED  AND 
FROM  THE  PLANET: THOUGHT  NO  LIFE-THICK  CLO 
FROM  THE  PLANET: IN  GLASS  CAGE  WAS  A  HUMAN 
TOMATO  MAN: WRITERS  BROUGHT  CAMERA  TO  HIS 
TOMATO  MAN:PUTTIN'  UP  'MATERS  MEANS  CAN  T 
TOMATO  MAN:AFTER  PEELING, SLICE  &  CORE  TOM 
CLOSING: PUN-DOORMAN  AT  PLAZA?  NO 
CLOSING: PUN-MORE  THAN  50  YEARS?  NO 
CLOSING: PUN- END  SWINGING  CAREER?  YES 
CLOSING: PUN-JOB  HAS  HELPED  HIM?  NO 
CLOSING: PUN-UNLOCK  SOME  SECRETS?  YES 
CLOSING: PUN-LOT  HINGES  ON  KINDNESS?  YES 
CLOSING: MAIN  PURPOSE-REPT  SWEENEY  LEAVES 
CLOSING: TONE  OF  CAPTION  IS  CLEVER  AND  WIT 
TRAVELS: MAN  AFRAID-FEARFUL  THOUGHTS, NO  DA 
LETTER  TO  NANCY: WRITER  KNOWS  RELATIONSHIP 
LETTER  TO  NANCY: WRITER  FEELS  REJECTED  AND 
SWINGING/STAR: PEOPLE  LIKE  PIG  IF  LAZY  AND 
SWINGING/STAR: PEOPLE  SHOULD  DIFFER-TRY  BE 
SWINGING/STAR: LINE  4  DOESN'T  RHYME  WITH  0 
TEEVEE: WHAT  MR&MRS  SPOUSE  NOT  KNOW-HUSBAN
TEEVEE: MR  &  MRS  SPOUSE  NOT  TALK  BECAUSE  W
TEEVEE: WRITER  MAKES  POEM  SOUND  FUNNY 
ANGRY: CHILD  COMES  OUT  WHEN  FEELS  BETTER 
ANGRY: CHILD  IS  PERSON  WHO  CAN  DEAL  WITH  0 
SONNET: POET  LIES  TO  MAINTAIN  AN  ILLUSION 
SONNET: BEST  THEME-LOVE  FULL  OF  PLEASING  S 
SONNET: LOVE  MADE  OF  TRUTH  MEANS  NEVER  LIE 
GRAPH: MOST  POWER  1980, 1985, 2000-PETROLEUM
GRAPH: IN  2000, HYDROPOWER  SUPPLY  LESS  THAN
GRAPH: IN  2000  NUCLEAR  POWER  MORE  %  TOTAL
BUS  SCHED: LAST  BUS  IN  EVENING  LEAVE  CITAD
BUS  SCHED: 2ND  SAT  AM  BUS  ARRIVE  DOWNTOWN
BUS  SCHED: MISS  2:35PM  FROM  HANCOCK  WAIT  T
BUS  SCHED: LV  RUSTIC  WED  9:42AM  ARRV  DWNTW




Table  B-1 
(continued) 

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


SS  548 

SS  549 

SS  550 

SS  551 

SS  552 

SS  553

SS  554 

SS  555 

SS  556 

SS  557 

SS  558 

SS  559 

SS  560 

SS  561 

SS  562 

SS  563 

SS  564 

SS  565

SS  566 

SS  567 

SS  568 

SS  569 

SS  570 

SS  571 

SS  572 

SS  573 

SS  574 

SS  575 

SS  576

SS  577 

SS  578 

SS  579 

SS  580 

SS  581 

SS  582 

SS  583 

SS  584 

SS  585 

SS  586 

SS  587 

SS  588 

SS  589 


H606000- 
H606000- 
H606000- 
H606000- 
H607000- 
H607000- 
H607000- 
H607000- 
H621000- 
H621000- 
H622000- 
H622000 
H622000 
H622000 
H622000 
H624000 
H624000 
H624000- 
H624000- 
H624000- 
H627000- 
H627000- 
H627000- 
H627000- 
H629000- 
H642000- 
H642000- 
H642000- 
H642000- 
H643000- 
H643000- 
H643000- 
H643000 
H645000 
H645000 
H645000 
H645000 
H646000 
H646000 
H646000 
H646000 
H646000 


A001/001
A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A004/004
-A001/001
-A002/002
-A003/003
-A004/004
-A005/005


N012801  GRAPH: SPENT  MOST  ON  A  BOOK 
N012802  GRAPH: RECORD  COST  $2.50 
N012803  GRAPH: 5  ITEMS  COST  MORE  THAN  PAINTBRUSH
N012804  GRAPH: SPENT  SAME  AMOUNT  ON  PAINTS, BIKE  PA 
FEB  CALENDAR:  FEB  18  IS  THURSDAY 
FEB  CALENDAR:  MONDAY  OCCURS  5  TIMES  IN  FE 
FEB  CALENDAR:  FRIDAY  CLOSEST  TO  10TH  IS  1
FEB  CALENDAR:  MON  AFTER  THIRD  TUES  IS  FEB 
N006401  TEXTS: INDEX-BEST  PLACE  LOCATE  BULL  RUN/HS 
N006402  TEXTS: GLOSSARY-BEST  PLACE  FIND  DELTA  DEF.
N006201  INDEX: FIND  DARIUS  INFO  ON  PG  23 
N006202  INDEX: FIND  CUNEIFORM  PRONUNCIATION 
N006203  INDEX: 1875  FRENCH  CONSTITUTION  INFO  ON  PG 
N006204  INDEX: ALTERNATE  HDG./DUTCH  EAST  INDIES-IN
N006205  INDEX: DISARMAMENT  IN  EASTERN  EUROPE  INFO 
N006601  TABLE  CONTENTS: MOST  USEFUL  IN  AMERICAN  HI 
N006602  TABLE  CONTENTS: AMERICAN  INDEPENDENCE  IN  U 
N006603  TABLE  CONTENTS: RECONSTRUCTION  AFT  CIVIL  W 
N006604  TABLE  CONTENTS: MAJOR  TOPIC  CHAP. 17-HAPPEN
N006605  TABLE  CONTENTS: MIDDLE  EAST  MAP, 1958-1970 
N011901  INDEX: FIND  OUT  ABOUT  SALMON-PGS  84&85 
N011902  INDEX: ALTERNATE  INFO-RAILRDS; TRAVEL  &  TRA 
N011903  INDEX: FIND  MAP  OF  SNAKE  RIVER-PG  84 
N011904  INDEX: FIND  MAP  S.  AMERICAN  RAIN  FORESTS-P
N012401  INDEX:ALPHA  LIST  OF  TOPICS  AND  PAGE  NUMBE
N007004  CATALOG  CD: OTHER  HEADING  TO  FIND  BOOK-SIE
N007002  CATALOG  CD:PG  FOR  OTHER  BOOKS  SAME  TOPIC-
N007003  CATALOG  CD: AUTHORS  OF  BOOK-COOPER  &  SIEDE
N007001  CATALOG  CD: WHAT  INFO  GIVES  LOCATION-GV  88
N012201  DICTIONARY: PLUME  IS  FEATHER
N012202  DICTIONARY: MORE  THAN  1  PLOWMAN  IS  PLOWMEN
N012203  DICTIONARY: PLUNDER-ROB
N012204  DICTIONARY: PLUM-IMPORTANT  WORK 

DICTIONARY: HOW  SYLLABICATE  HACKBERRY/HACK 
DICTIONARY: PLURAL  OF  HABITUS-  HABITUS 
DICTIONARY: ADVERB  FORM-HABITUAL/HABITUALL
DICTIONARY: HACKMATACK-A  TYPE  OF  TREE
N011601  DICTIONARY: DEFINITION  TOME-A  LARGE  BOOK 
N011602  DICTIONARY: TOMORROW  SYLLABICATED-TO  MOR  R 
N011603  DICTIONARY: PLURAL  IS  TONSILLECTOMIES 
N011604  DICTIONARY: TOLERANCE  IS  A  NOUN
N011605  DICTIONARY: TONIC -MAKES  YOU  FEEL  BETTER 




Table  B-1 
(continued)

List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 


Type    No.      ECS  ID 


ETS  ID 


DESCRIPTION 


SS  590  H647000-A001/001
SS  591  H647000-A002/002
SS  592  H647000-A003/003
SS  593  H647000-A004/004
SS  594  H647000-A005/005
SS  595  H650000-A001/001
SS  596  H650000-A002/002
SS  597  H650000-A003/003
SS  598  H650000-A004/004
SS  599  H651000-A001/001
SS  600  H652000-A001/001
R  601  H662000-A001/001
R  602  H662000-A002/002
R  603  H662000-A003/003
R  604  H662000-A004/004
R  605  H662000-A005/005
R  606  H662000-A006/006
R  607  H662000-A007/007
R  608  H662000-A008/008
R  609  H662000-A009/009
R  610  H662000-A010/010
R  611  H662000-A011/011
R  612  H662000-A012/012
R  613  H662000-A013/013
R  614  H662000-A014/014
R  615  H662000-A015/015
R  616  H662000-A016/016
R  617  H662000-A017/017
R  618  H662000-A018/018
R  619  H662000-A019/019
R  620  H662000-A020/020
R  621  H662000-A021/021
R  622  H662000-A022/022
R  623  H662000-A023/023
R  624  H662000-A024/024
R  625  H662000-A025/025
R  626  H662000-A026/026
R  627  H662000-A027/027
R  628  H662000-A028/028
R  629  H662000-A029/029
R  630  H662000-A030/030
SS  631  H663000-A001/001

N012701 
N012702 
N012703 
N012704 
N011501 
N012501 


GUIDE  WDS:PAGE  558  FOR  MUSK 
GUIDE  WDS:PAGE  560  FOR  MYSTERY 
GUIDE  WDS:PAGE  561  FOR  NAIAD 
GUIDE  WDS:PAGE  559  FOR  MUZHIK 
GUIDE  WDS:PAGE  560  FOR  MYSTIC 
ENCYCLOPEDIA: INFO  ON  MEXICO  IN  VOLUME  6 
ENCYCLOPEDIA: INFO  ON  INVENTIONS  OF  EDISON 
ENCYCLOPEDIA: INFO  ON  IOWA  FARM  PRODUCTS  I
ENCYCLOPEDIA: INFO  ON  N.Y.  RIVERS  &  LAKES
DICTIONARY: TO  FIND  WORD  MEANING-DICTIONAR 
ENCYCLOPEDIA: TO  FIND  INFO  ON  WHALE  FOOD-K 


N006001 


GALAPAGOS: COLONIZATION  UNDER  HUMANS  ON  TH
GALAPAGOS: OCEAN  AREA  APRX  36,000  SQ  MILES
GALAPAGOS: LOWLANDS-CINDER  WITH  SHARP  EDGE
GALAPAGOS: WOODPECKER  FINCH  USES  TOOL  TO  F
GALAPAGOS: DAGGERS  LEFT  BY  BUCCANEERS
GALAPAGOS: JERVIS  ISLAND  ABOUT  5  MI.  S.  OF
GALAPAGOS: RESEARCH  STATION  ON  INDEFATIGAB
GALAPAGOS: ISLANDS  ERUPTED  FROM  SEA  FLOOR
GALAPAGOS: MELVILLE  SAYS  ROCK  RODONDO  LIKE
GALAPAGOS: VILLIERS  WROTE  ABOUT  RETRACING
GALAPAGOS: INDEFATIGABLE  NAMED  AFTER  SHIP
GALAPAGOS: DARWIN  THEORY-NATURAL  SELECTION
GALAPAGOS: CORMORANT  CAN'T  FLY
GALAPAGOS: CALLED  CROSSROAD-2  CURRENTS  MEE
GALAPAGOS: SAILORS  WITH  MELVILLE  WERE  WHAL
GALAPAGOS: SADDLE-SHAPE  SHELLS-TALL  CACTI
GALAPAGOS: COLONIZATION  FAILED-STRIFE  &  RE
GALAPAGOS: NARBOROUGH-MOST  AWESOME
GALAPAGOS: ENG.  NAME  SANTA  MARIA  IS  CHARLE
GALAPAGOS: 4  VEGETATION  ZONES
GALAPAGOS: SUGGEST  CAREFULLY  MANAGE  TOURIS
GALAPAGOS: SEYCHELLES  ONLY  OTHER  PLACE  LAN
GALAPAGOS: SOME  SPECIES  SURVIVED  BECAUSE  D
GALAPAGOS: PASS  BARRINGTON  IF  SAIL  ACADEMY
GALAPAGOS: IDEAL  FOR  VARIETY-WARM  &  COOL  C
GALAPAGOS: HUMAN  THREAT-NEW  PLANTS  &  ANIMA
GALAPAGOS: WELLINGTON  BEST  RECENT  SOURCE
GALAPAGOS: BISHOP  DISCOVERED
GALAPAGOS: MARINE  IGUANA  AT  ESPINOSA  POINT
GALAPAGOS: SHARP-BEAKED  GROUND  FINCH  DRINK
PHONE  DIR: STORES  SELL  MILK  LISTED  UNDER  D




Table  B-1 
(continued) 


List  of  Items  Initially  Considered  for  Trend  Analysis 
(R=Reading,  SS=Study  Skills) 

Type    No.      ECS  ID  ETS  ID  DESCRIPTION 

SS  632  H663000-A002/002  N006002  PHONE  DIR: HENDRICKS  MINING  ON  63RD  ST,  AA 
SS    633  H663000-A003/003  N006003  PHONE  DIR: STAR  TRACKER  OPEN  TO  REPAIR  MIC 
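
Table B-2, which follows, lists for each trend item the estimates a, b, and c together with their standard errors. As a minimal sketch only, and assuming these are the discrimination, difficulty, and lower-asymptote parameters of a three-parameter logistic item response model, the probability of a correct response at a given proficiency could be computed as shown below. The function name `p_correct`, the scaling constant D = 1.7, and the proficiency scale are illustrative assumptions and are not taken from this appendix; the numeric values are those listed for item 107 in the Age 9 trend data.

```python
import math


def p_correct(theta, a, b, c, D=1.7):
    """Three-parameter logistic item response function (assumed model).

    theta: examinee proficiency, on the same scale as b
    a, b, c: discrimination, difficulty, and lower asymptote of the item
    D: scaling constant; 1.7 is a common convention and an assumption here
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Estimates listed for item 107 in Table B-2 (Age 9 trend data).
a107, b107, c107 = 1.22871, -0.23181, 0.12205

# When theta equals b, the curve sits halfway between c and 1.
p_at_b = p_correct(b107, a107, b107, c107)
print(p_at_b)
```

Under the same assumption, an item whose c estimate is listed as 0.00000 reduces to a two-parameter logistic curve, since the lower asymptote term drops out.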




Table  B-2 


Item  Parameter  Estimates  and  Standard  Errors 


Age  9  Trend  Data 

Item  a  s.e.(a)  b  s.e.(b)  c  s.e.(c)

4 

1.45665 

0.15096



0.96805 



0.25239 

0  00R3S 

7 

1.62013 

0.18008 

-0.66107 

0.08822 

0  91S1ft 

8 

2.87125 

0.31297 

-1.05092 

0.08273 

9 

1.95374 

0.19194

-0.99160 

0.06814 

0  1  ftQAQ 

0  0307Q 

10 

0.66239 

0.12542

0.92530 

0.35950 

11 

0.32097 

0.03743 

-0.27738 

0.08125 

15 

1.72384 

0.17108

-1.71880 

0.13016 

0  1  ROAR 

1.82295 

0.16808

-0.94913 

0.06309 

0  1  ftQOO 

60 

1.65532 

0.09796 

-1.93793 

0.09259 

0  1A9'^0 

66 

0.94869 

0.11392 

-2.77999 

0.25983 

0. 1fllQ9 

0  0SS19 

67 

0.91878 

0.07605 

-1.14013 

0.05928 

0  16063 

0  0A0ft3 

76 

1.17304 

0.05089 

-1.29151 

0.03818 

0.19673

0  09ft09 


77 

1.25806 

0.07384 

-1.28936 

0.05040 

0  916A7 

0  03A13 

78 

0.94913 

0.09982 

-1.37673 

0.08868 

0  17flH 

0  OA777 

79 

1.45911 

0.14260

-1.46350 

0.09451 

0  17109 

0  03QftQ 

80 

1.19967 

0.06978 

-1.99588 

0.08408 

0  lft77Q 

0  0Afi7Q 

81 

1.12587 

0.11218

-1.94154 

0.13624 

0.1  72^^*5 

82 

1.32448 

0.14185

-1.70034 

0.12522 

0  99*^^7 

n  ^AQQ«^ 

83 

1.39426 

0.17551 

-2.74479 

0.30524 

0  1 AOAO 

0  0S3SQ 

84 

1.57546 

0.18945

-2.55817 

0.28621 

0  lAAftO 

0  OS3A9 

85 

0.93831 

0.14278

-3.49095 

0.43919 

0. lfllQ9 


0  OSAAA 

86 

1.59383 

0.08969 

-1.46392 

0.05524 

0.13331

0  09QAR 

87 

0.86089 

0.04992 

-1.52366 

0.05598 

0.11620


0  033Qfl 

88 

1.66925 

0.10922

-2.21662 

0.12506 

0.1A307 

0  OAlAl 

89 

1.09681 

0.10441

-2.21673 

0.15663 

0. 15A38 

0  OSOSS 

90 

1.50519 

0.13140

-1.87898 

0.12709 

0. 1 AOlO 

0  03QAn 

91 

1.14538 

0.07313 

-2.16925 

0.10265 

0  17387 

0  OA7fi9 

92 

1.44482 

0.08494 

-1.97198 

0.08851 

0  13fll3 


0  03679 

93 

1.02755 

0.06403 

-2.18892 

0.09931 

0  1 3ARS 

0  0A3SS 

94 

0.92444 

0.04975 

-1.06644 

0.04211 

0. 1 7AA9 

0  031Qft 

97 

1.21940 

0.09517 

-2.87148 

0.18765 

0  16796 


0  0S013 

98 

0.66907 

0.07745 

-1.38469 

0.09564 

0  18951 

0  0S3fiA 

99 

1.56126 

0.15996 

-2.06415 

0.17448 

0.18542

0.05126 

101 

1.69359 

0.14875 

-1.85667 

0.12591 

0.37038 

0.04774 

102 

2.64499 

0.25305 

-1.94163 

0.20195 

0.41954 

0.04549 

103 

1.32585 

0.12925 

-2.40079 

0.18891 

0.41107 

0.06195 

104 

1.20762 

0.11853 

-0.96681 

0.07508 

0.39818 

0.03685 

106 

0.66741 

0.14401 

0.68002 

0.35685 

0.19696 

0.03617 

107 

1.22871 

0.08880 

-0.23181 

0.07365 

0.12205 

0.01780 

108 

1.24882 

0.17635 

0.22229 

0.20459 

0.12415 

0.0205. 

112 

1.43833 

0.08189 

-1.82431 

0.07591 

0.13674 

0.03630 

113 

1.06728 

0.14794 

0.27592 

0.19320 

0.11016 

0.02078 



Table  B-2 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  9  Trend  Data 


Item

115 

120 

121 

122 

123 

124 

127 

128 

130 

149 

150 

157 

159 

160 

161 

166 

167 

168 

169 

170 

171 

174 

175 

178 

189 

190 

193 

196 

198 

199 

200 

205 

207 

208 

214 

216 

219 

220 

221 

222 

223 


0.67148 

0.85304 

1.41027 

1.00560 

1.05064 

0.69985 

1.03001 

0.95646 

1.17113 

1.63001 

1.65918 

1.34995 

0.99231 

1.96251 

0.82708 

0.97904 

1.78542 

2.50099 

1.13560 

0.65379 

0.82014 

0.95979 

1.25596 

1.11200 

1.33751 

0.83547 

1.05468 

1.80469 

1.13646 

1.25632 

0.69185 

1.25845 

1.03764 

1.22173 

1.00078 

1.12231 

1.30714 

0.90441 

1.04075 

0.86623 

1.73741 


s.e.(a) 

0.12101 
0.09532 
0.14619 
0.10451 
0.10112 
0.15599 
0.09766 
0.09471 
0.07637 
0.17429 
0.09793 
0.11818 
0.14202 
0.19666 
0.13658 
0.14260 
0.22306 
0.24802 
0.18892 
0.06804 
0.19706
0.17436
0.21244
0.13976
0.11569
0.08371
0.11985
0.22825
0.10316
0.14721
0.25980
0.16086
0.10284
0.09646
0.10760
0.16807
0.17360
0.09095
0.05843
0.09203
0.17329


0.08012 
-1.11652 
-1.82682 
-2.20773 
-1.64216 
0.57315 
-1.84650 
-1.58179 
-1.15664 
-0.33295 
-1.20137 
-1.69445 
0.09872 
-1.82268 
0.44922 
-0-0829'; 
-6.05461 
0.39756 
1.10525 
-0.14449 
1.66266 
1.28951 
1.42966 
0.12194 
-1.07488 
-1.97905 
-0.72610 
0.71565 
-1.70297 
-0.50009 
3.38179 
0.83819 
-1.47947 
-0.13467 
-1.45260 
0.37240 
-0.01111 
-2.41293 
-1.86921 
-1.50656 
-2.17127 


s.e.(b) 

0.19898 

0.07640 

0.13735 

0.16776 

0.10258 

0.34902 

0.11914 

0.09985 

0.04901 

0.11058 

0.04581 

0.10373 

0.17424 

0.15528 

0.24014 

0.15536 

0.18073 

0.25583 

0.38822 

0.09662 

0.64085 

0.43469 

0.46366 

0.15558 

0.05902 

0.13389 

0.07396 

0.30623 

0.10388 

0.09667 

1.68492 

0.26563 

0.09058 

0.08884 

0.09642 

0.22812 

0.16674 

0.17891 

0.07365 

0.09917 

0.19391 



0.17877
0.18592
0.20668
0.18723
0.16097
0.21732
0.14967
0.16187
0.17500
0.12634
0.17605
0.15630
0.14339
0.18112
0.09861
0.17794
0.17162
0.16633
0.24602 
0.13768 
0.17149 
0.20960 
0.13542 
0.08180 
0.10176 
0.15101 
0.14852 
0.14841 
0.13106 
0.18098 
0.07379 
0.08779 
0.14524 
0.16783 
0.19838 
0.12059 
0.15770 
0.13717 
0.16039 
0.18043 
0.14294 


s.e.(c) 

0.04343 
0.04647 
0.05139 
0.05418 
0.04505 
0.03902 
0.04383 
0.0453' 
0.03294 
0.02115 
0.02648 
0.04127 
0.02672 
0.04375 
0.02574 
0.03267 
0.01986 
0.01313 
0.01139 
0.03282 
0.01407 
0.01316 
0.00863 
0.01965
0.02931
0.04585
0.03643
0.01250
0.04225
0.03112
0.00878
0.00940
0.04375
0.01807
0.05033
0.02030
0.02433
0.04662
0.04590
0.04937
0.04481




Table  B-2 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  9  Trend  Data 


Item

224 

225 

243 

244 

245 

246 

247 

248 

249 

250 

252 

253 

268 

279 

336 

337 

338

339 

340 

341 

342 

344 

347 

348 

349 

350 

351 

352 

353 

355 

356

357 

362 

363 

365 

366 

370 

371 

372 

373 

374 


1.03601 

1.13881 

1.11405 

0.86056 

1.39376 

0.42236 

2.47745 

1.52161 

1.18964 

2.21517 

1.14745 

0.64017 

0.46158 

1.70130 

1.07139 

0.83792 

1.24921 

1.16341 

1.25119 

1.10890 

1.65120 

1.21641 

1.09970 

1.33043 

0.94136 

0.43383 

0.45252 

2.36995 

1.81794 

1.46610 

1.34766 

1.14389 

0.85569 

2.09948 

1.65587 

1.47509 

0.93695 

1.45610 

1.45345 

2.03013 

1.55085 


s.e.(a)

0.05628
0.06317
0.04689
0.04431
0.07364
0.06082
0.19689
0.14305
0.09125 
0.16677 
0.12281 
0.0/ 7«4 
0.11309 
0.11453 
0.06248 
0.07518 
0.13382 
0.11091 
0.11450 
0.06620 
0.10490 
0.11800 
0.46414 
0.12852 
0.25125 
0.04097 
0.18486 
0.20085 
0.19623 
0.09872 
0.06790 
0.07424 
0.11883 
0.23640 
0.15931 
0.13858 
0.09441 
0.08253 
0.08452 
0.12710 
0.12617 


-1.54125 
-1.56618 
-1.55683 
-2.-  640 
-2.31927 
-5.09262 
-1.79059 
-2.45148 
-1.87708 
-1.71067 
-2.90807 
-2.76341 
-0.07543 
-2.18679 
-1.81221 
-0.85033 
-1.35032 
-0.7S563 
-1.41784 
-1.99837 
-1.60878 
-1.65139 
2.22396 
-2.13280 
2.10038 
-0.05264 
5.70807 
-1.93460 
-0.96893 
-2.09186 
-1.92250 
-1.45168 
-0.78045 
0.54432 
-1.74025 
-1.95600 
-0.95998 
-0.37323 
-1.75548 
-1.93365 
-2.17935 


s.e.(b) 

0.05345 

0.05642 

0.04108 

0.07504 

0.10130 

0.62155 

0.16075 

0.21028 

0.10397 

0.12910 

0.25957 

0.24862 

0.24320 

0.11713 

0.07253 

0.04917 

0.09046 

0.06026 

0.08324 

0.08525 

0.07133 

0.10876 

1.47631 

0.16105 

0.85089 

0.08992 

2.66925 

0.16690 

0.07681

0.10951 

0.07152 

0.06249 

0.09223 

0.2/J65 

0.12735 

0.14327 

0.05837 

0.05326 

0.07298 

0.10840 

0.14486 


0.10274 
0.11191 
0.00000 
0.00000 
0.00000 
0.13154 
0.00000 
0.00000 
0.00000 
0.00000 
0.00000 
0.14679 
0.47724 
0.16811 
0.13690 
0.08419 
0.20965 
0.10637 
0.14447 
0.14442 
0.21153 
0.18572 
0.22170 
0.16465 
0.14704 
0.10881 
0.12353 
0.17220 
0.24766 
0.21065 
0.13172
0.20633
0.28464
0.12095
0.18534
0.  ...681
0.10667
0.11360
0.21959
0.13714
0.17586


s.e. (c) 

0.03083 
0.03177 
0.00000 
0.00000 
0.00000 
0.05178 
0.00000 
0.00000 
0.00000 

0.00000

0.00000 
0.05031 
0.05694 
0.04729 
0.03869 
0.0273/ 
0.04700 
0.02943 
0.03794 
0.04108 
0.03731 
0.04636 
0.01363 
0.04706 
0.00905 
0.03428 
0.01099 
0.04030 
0.03310 
0.05111 
0.03504 
.04157 
.049?7 
.0.1233 
.04517 
0.04001 
0.03148 
0.01350 
0.04232 
0.03284 
0.04945 


0. 
0. 
0. 
0. 




Table  B-2 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  9  Trend  Data 


Item 

375 

377 

378 

379 

385 

387

388 

389 

398 

399 

400 

401 

402 

403 

404 

405 

406 

407 

408 

417 

418 

419 

433 

434 

435 

438 

439 

440 

441 

442 

443 

448 

449 

450 

451 

452

469 

470 

471 

472 

473 


a 

1.34915 
1.04551 
1.69410 
1.10467 


82456 
70937 
42104 
25237 
16733 
20873 
66343 
1.95995 
1.73397 
1.03200 
0.81103 
1.10564 
1.93534 
1.19942 
0.94338 
1.33547 
0.87379 
1.60306 
0.81859 
1.18256 
1.34429 
1.92126 
1.23265 
1.64252 
0.73215 
1.15543 
1.18965 
1.08361 
0.75963 
1.12612 
1.51908 
1.25185 
1.04435 
1.17287 
0.97865 
1.30829 
2.43421 


s.e.(a) 

0.13452 

0.20001 

0.15363 

0.06725 

0.22375 

0.15263 

0.12617 

0.08340 

0.21882 

0.08408 

0.12694 

0.13244 

0.14022 

0.10047 

0.06516 

0.09691 

0.20619 

0.18510 

0.12559 

0.10657 

0.14376 

0.14815 

0.08661 

0.11698 

0.14873 

0.18587 

0.15929 

0.16220 

0.07352 

0.27870 

0.12674 

0.12266 

0.08890 

0.12760 

0.15940 

O.lOS'l 

0.07976 

0.09262 

0.08060 

0.13259 

0.24895 


0.46445 
0.36449 
-1.51611 
-1.37773 
0.53405 
-1.2958 J 
-0.84776 
-1.20106 
-1.61968 
-1.39937 
-1.54185 
-1.34619 
-0.93326 
-0.63983 
-1.26452 
-0.78831 
-0.62723 
0.08668 
0.15715 
-2.13783 
-0.08002 
-1.40450 
-0.02157 
-0.24546 
-0.37569 
-0.70323 
-0.3P963 
-1.44315 
-1.48787 
1.04316 
-0.42123 
-0.86218 
-1.18412 
-0.72118 
-1.19059 
-0.72960 
-1.85361 
-0.90836 
-0.72898 
-0.78289 
-1.38075 


s.e. (b) 

0.18176 

0.28720 

0.09780 

0.05322 

0.26473 

0.07823 

0.06042 

0.05129 

0.14088 

0.06249 

0.08238 

0.06509 

0.05926 

0.07386 

0.0C362 

0.06296 

0.09013 

0.19940 

0.17163 

0.13200 

0.16922 

0.08790 

0.11156 

0.10031 

0.10544 

0.07761 

0.11903 

0.09785 

0.08994 

0.54183 

0.09000 

0.07803 

0.08352 

0.08038 

0.08008 

0.06505 

0.09809 

0.05645 

0.05748 

0.07138 

0.11377 


0. 
0. 


0.20895 
0.18558 
0.12665 
0.14688 
0.08996 
0.15965 
0.12597 
0.12811 
0.14230 
0.136O-' 
0.18  . 
0.12551 
,24043 
,21860 
0.15731 
0.19519 
0.17492 
0.18363 
0.20520 
0.17656 
0.20247 
0.14686 
0.12768 
0.19407 
0.17814 
0.14777 
0.20591 
0.18021 
0.15673 
0.14968 
0.12832 
0.19149 
0.18537 
0.17794 
0.19673 
0.21056 
0.18646 
0.17833 
0.15825 
0.17652 
0.20367 


s.e. (c) 

0.01253 

0.02745 

0.03532 

0.03455 

0.01158 

0.03575 

0.02964 

0.03096 

0.03626 

0.03545 

0.03922 

0.02678 

0.03013 

0.03546 

0.04268 

0.03503 

0.02754 

0.02676 

0.02732 

0.05044 

0.03893 

0.03775 

0.02655 

0.02473 

0.02922 

0.02830 

0.03473 

0.04548 

0.04578 

0.01816 

0.02891 

0.04513 

0.05284 

0.04052 

0.04207 

0.03182 

0.05149 

0.03448 

0.03284 

0.03488 

0.03706 




Table  B-2 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  9  Trend  Data 


Item 

474 

475 

476 

500 

501 

502 

503 

504 

505 

506 

507 

508 

509 

510 

511 

512 

513 

514 

515 

530 

531 

532 

536 

537 

636 

640 

643 

647 

648 

649 

650 

651 

652 

654 

655 

656 

657 

658 

659 

660 

661 


1.34128 
1.43855 
0.89898 
1.41644 
1.31145 
1.04475 
1.53340 
1.02968 

0.93224 
1.55987 
1.42387 
1.63450 
1.11028 
1.05518 
1.75982 
1.33454 

1.  P2310 
1.29440 
1.23656 
1.16728 
0.51957 
1.26473 
0.85027 
1.13481 
1.12831 
1.27522 
1.10518 
0.69760 
0.59160 
0.68113 
0.96024 
0.52144 
0.86335 


1.64918 
1.04616 
0.91008 
0.97771 
1.11430 
1.85222 
0.86353 
1.20671 


s.e. (a) 

0.14595 

0.10385 

0.14535 

0.14586 

0.11815 

0.10269 

0.14219 

0.10303 

0.11022 

0.14574 

0.24339 

0.17901 

0.13251 

0.15680 

0.26286 

0.19020 

0.13242 

0.12827 

0.13764 

0.12330 

0.11242 

0.13073 

0.08372 

0.14134 

0.12857 

0.15208 

0.11147 

0.11023 

0.07916 

0.23487 

0.14577 

0.10399 

0.30370 

0.23320 

0.10684 

0.12226 

0.10406 

0.12583 

0.19199 

0.10333 

0.14872 


-0.49917 
-1.34763 
0.00572 
-1.13239 
-1.37300 
-1.03543 
-1.15922 
-0.98596 
-1.22248 
-1.11662 
0.33306 
-0.67196 
-0.64978 
-0.09065 
-0.04544 
-0.26224 
-0.53178 
-1.12835 
-0.28220 
-1.29504 
0.70361 
-1.20018 
-0.74637 
-0.36559 
-2.23942 
-0.53147 
-1.57629 
-0.16664 
-0.62442 
2.40683 
0.08063 
0.02906 
2.17909 
-0.05076 
-1.54971 
-0.52723 
-1.22482 
-0.92637 
-1.76251 
-2.34647 
-0.75734 


s.e. (b) 

0.09335 

0.06436 

0.18115 

0.07708 

0.07896 

0.06383 

0.06875 

0.06324 

0.08908 

0.06733 

0.28322 

0.08493 

0.08506 

0.16117 

0.21847 

0.15239 

0.09803 

0.07108 

0.10436 

0.08520 

0.35092 

0.07778 

0.06895 

0.10900 

0.18994 

0.09437 

0.10394 

0.14829 

0.07977 

1.16716 

0.18609 

0.20 '^*5 

1.14U81 

0.19829 

0.10161 

0.10335 

0.08104 

0.07852 

0.15095 

0.20144 

0.08486 


0.17519 

0.18225 

0.20719 

0.18685 

0.13340 

0.13807 

0.14062 

0.14524 

0.21514 

0.14629 

0.17263 

0.19917 

0.18948 

0.18864 

0.25424 

0.27183 

0.20477 

0.15587 

0.10905 

0.19283 

0.16853 

0.18034 

0.19505 

0.15987 

0.18080 

0.16481 

0.17985 

0.19070 

0.15105 

0.20290 

0.17753 

0.22364 

0.16658 

0.2ir;'7. 

0.17by5 
0.20468 
0.17939 
0.19640 
0.17189 
0.17788 
0.20772 


s.e.(c) 

0.03115 

0.03798 

0.03544 

0.04471 

0.03696 

0.03753 

0.03465 

0.03729 

0.05387 

0.03411 

0.02080 

0.03090 

0.03758 

0.03251 

0.02367 

0.03252 

0.03916 

0.03855 

0.02429 

0.04778 

0.04218 

0.04382 

0.04092 

0.03232 

0.05404 

0.03337 

0.05068 

0.04640 

0.04815 

0.02061 

0.03365 

0.05564 

0.01693 

0.02435 

0.05094 

0.04590 

0.04940 

0.04653 

0.04698 

0.05367 

0.04083 




Table  B-2 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  9  Trend  Data 


Item         a   s.e.(a)         b   s.e.(b)         c   s.e.(c)

662    1.80891   0.19489  -1.82219   0.16094   0.18721   0.05083
663    1.84316   0.21016  -0.49321   0.11317   0.20118   0.02834
664    1.56465   0.18873  -0.47706   0.11480   0.22790   0.03125
665    1.79408   0.19723  -1.91959   0.17800   0.18332   0.05134
666    1.37500   0.14311  -1.14685   0.07887   0.19433   0.04550




Table  B-3 


Item  Parameter  Estimates  and  Standard  Errors 
Age  13  Trend  Data 


Item         a   s.e.(a)         b   s.e.(b)         c   s.e.(c)

1      1.59669   0.09324  -0.79312   0.06893   0.16047   0.03839
2      1.56585   0.06725  -0.72331   0.04851   0.00000   0.00000
4      1.02543   0.07970   0.45207   0.06063   0.18086   0.02679
5      1.00796   0.11719  -0.00003   0.06423   0.17561   0.04379
6      0.85429   0.05997  -0.33567   0.05252   0.18056   0.04105
10     0.97733   0.10085   0.45979   0.07949   0.27963   0.03209
11     0.53348   0.05121  -0.52690   0.08108   0.22274   0.05446
12     2.28036   0.14199   1.32833   0.13528   0.10548   0.00713
13     1.05428   0.14578   1.36803   0.21623   0.28201   0.01852
14     1.26094   0.16461   1.68690   0.25396   0.14184   0.01102
16     1.58373   0.09897  -1.02675   0.08995   0.18467   0.04394
17     1.54804   0.09796  -0.38620   0.05018   0.21044   0.03485
19     0.68727   0.052b^  -0.75600   0.08050   0.20467   0.05125
20     1.12722   0.13906  -1.82493   0.26332   0.22530   0.05961
21     2.07925   0.24649  -1.33136   0.25648   0.20114   0.05075
22     1.03389   0.10512  -1.19468   0.14785   0.18536   0.04971
51     0.77897   0.25343   2.29684   0.76200   0.24235   0.02525
52     0.56916   0.12481   1.33634   0.29640   0.21088   0.04429
53     1.56437   0.19188   0.40827   0.09965   0.25364   0.03077
54     1.06172   0.12614   0.28327   0.07264   0.18126   0.03697
55     1.98289   0.19297   0.07802   0.06055   0.15424   0.02918
56     1.80407   0.21783   0.35885   0.09821   0.24294   0.03041
57     0.75765   0.18116   1.57444   0.37991   0.15702   0.03079
58     0.88345   0.35001   3.04480   1.24221   0.14331   0.01460
59     0.70439   0.18681   1.84646   0.48916   0.14683   0.03016
61     0.77015   0.09384  -1.57914   0.21587   0.22203   0.05905
65     0.95088   0.13330   0.53724   0.10693   0.19689   0.03825
66     0.91527   0.15786  -2.80040   0.52411   0.22035   0.05939
69     1.70230   0.11925  -0.66598   0.07883   0.00000   0.00000
70     0.96963   0.07606  -0.06109   0.03339   0.00000   0.00000
71     0.80993   0.07230   0.38087   0.03879   0.00000   0.00000
72     1.06220   0.08377   0.34509   0.03654   0.00000   0.00000
73     1.54911   0.10640  -0.38481   0.05335   0.00000   0.00000
74     1.61040   0.10880   0.00975   0.03461   0.00000   0.00000
75     2.18640   0.14396   0.15672   0.03674   0.00000   0.00000
76     1.17766   0.05648  -1.11702   0.06996   0.22649   0.04927
92     0.94269   0.07555  -1.89281   0.17054   0.20516   0.05472
94     0.99176   0.05087  -1.16519   0.07705   0.24705   0.05469
96     1.14800   0.11845  -1.54394   0.19134   0.13873   0.04795
98     1.20288   0.14508  -1.09817   0.16472   0.27872   0.06557
99     0.84890   0.14715  -2.66391   0.49495   0.21940   0.05876
106    1.20270   0.12138  -0.15590   0.06050   0.14699   0.03857
111    0.92349   0.11275   0.59882   0.09206   0.12342   0.03099
113    1.19757   0.14462   0.33611   0.08115   0.19502   0.03591




Table  B-3 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  13  Trend  Data 


Item 

a 



116 



1.53451 

118 

1.45582 

119 

0.58817 

121 

1.34064 

124 

0.88448 

126 

1.19999 

131 

1.28959 

132 

0.68989 

133 

1.53804 

134 

0.37381 

135 

1.27977 

136 

1.25902 

137 

1.61458 

138 

1.56286 

139 

1.20676 

140 

0.98930 

141 

19419 

143 

1.39277 

144 

0.84293 

146 

0.94167 

151 

1.57669 

152 

1.62846 

154 

1.11360 

155 

1.33410 

156 

0.43739 

161 

0.70020 

166 

1.05529 

167 

1.35363 

168 

1.51210 

169 

0.73604 

170 

0.49417 

171 

0.86872 

172 

1.60819 

173 

0.78839 

174 

0.87028 

176 

1.08070 

177 

0.64669 

178 

1.39032 

180 

1.67296 

181 

0.41095 

182 

1.67108 

183 

1.90013 

184 

2.76375 

s.e.(a) 

0.09436 
0.08787 
0.05523 
0.16727 
0.10726 
0.13285 
0.12507 
0.05077 
0.13257 
0.04464 
0.10897 
0.09101 
0.11286 
0.16256 
0.10904 
0.11030 
0.07091 
0.09653 
0.17586 
0.21365 
0.14607 
0.15165 
0.10755 
0.13986 
0.16024 
0.08487 
0.10363 
0.12715 
0.14793 
0.08029 
0.03907 
0.08187 
0.16330 
0.08013 
0.08819 
0.39388 
0.21068 
0.13301 
0.14542 
0.03595 
0.08483 
0.09132 
0.12646 


-1.11160 
-0.46581 
-2.20859 
-1.79824 
-0.17137 
-0.01589 
-1.10239 
0.01906 
0.03750 
-2.55330 
0.20932 
-0.30925 
0.46445 
-1.04807 
-0.49670 
-0.05248 
-0.53257 
0.09931 
1.07581 
1.98451 
0.23289 
-0.42908 
-0.65840 


44459 
70862 
37928 
67225 
57525 
09857 
72449 
0.87892 
0.71826 
1.29110 
1.00407 
0.90340 
2.74079 
2.80808 
0.55855 
0.08458 
1.44278 
0.68204 
-0.41888 
-0.02558 


s.e.(b) 

0.09251 
0.05068 
0.22257 
0.27852 
0.07267 
0.06935 
0.i:}984 
0.0"015 
0.05660 
0.32110 
0.06195 
0.05517 
0.06078 
0.14887 
0.07370 
0.06666 
0.05335 
0.04490 
0.25875 
0.47576 
0.05702 
0.07411 
0.09082 
0.19189 
0.98583 
0.06596 
0.09300 
0.08489 
0.05987 
.09719 
.08515 
.08408 
18178 
,12300 
,10679 
.05989 
.91453 
0.08444 
0.05108 
0.12409 
0.05719 
0.04326 
0.03067 


0. 
0. 
0. 
0. 
0. 
0. 
1. 
0. 


13645 
16973 
21818 
21453 
18722 
0.21113 
0.18453 
0.10670 
0.24944 
0.23212 
0.34308 
0.31344 
0.18000 
0.19909 
0.12228 
0.19580 
0.21024 
0.21217 
0.55192 
0.26071 
0.09484 
0.14280 
13158 
13718 
36504 
10757 
14074 
0.14139 
0.15647 
0.17979 
0.11392 
0.14879 
0.13407 
0.11848 
0.18969 
0.09458 
0.14116 
0.14897 
0.09911 
0.11177 
0.00000 
0.00000 
0.00000 


0. 
0. 
0. 
0. 
0. 


s.e.(c) 

0.04295 
0.03572 
0.05866 
0.05699 
0.05080 
0.04377 
0.04791 
0.03408 
0.03272 
0.06189 
0.02921 
0.03631 
0.01786 
0.05025 
0.03919 
0.04515 
0.03931 
0.02646 
0.02809 
0.01745 
0.02371 
0.03691 
0.04261 
0.04716 
0.03397 
0.03520 
0.04572 
0.04310 
0.03723 
0.03468 
0.03992 
0.02669 
0.04516 
.04142 
.02483 
.01220 
.02316 
.04074 
.02726 
0.02241 
0.00000 
0.00000 
0.00000 


0. 
0. 
0. 
0. 
0. 
0. 




Table  B-3 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  13  Trend  Data 


Item 

a 

s.e. (a) 

b 




185 

2.51673 

0.11362 

0.03787 

186 

1.89709 

0.16323 

0.03287 

187 

1.52011 

0.13599 

0.46874 

188 

1.64724 

0.13332 

0.18187 

193 

1.95877 

0.19860 

-0.67533 

194 

0.98902 

0.19316 

1.45942 

195 

1.75025 

0.22692 

0.62868 

196 

1.17830 

0.10074 

0.31874 

197 

0.95770 

0.0837r 

1.28597 

198 

0.79796 

0.0832 J 

-1.25946 

200 

1.48339 

0.20282 

0.88927 

201 

1.52326 

0.10658 

-1.47179 

203 

1.11234 

0.13766 

0.48069 

204 

0.69822 

0.08559 

-0.63959 

210 

0.98206 

0.14145 

0.79140 

212 

1.37474 

0.13678 

0.18716 

213 

1.02890 

0.21001 

1.21627 

216 

0.91287 

0.09734 

-0.04536 

217 

1.46516 

0.20963 

1.01553 

218 

1.37473 

0.19135 

0.59180 

219 

1.18845 

0.11622 

-0.34421 

236 

1.13623 

0.06069 

-0.70305 

237 

1.51939 

0.07136 

-0.31981 

238 

1.44243 

0.06664 

0.04160 

239 

1.30005 

0.06279 

-0.29244 

268 

0.97429 

0.17236 

-0.29744 

281 

0.918/8 

0.07311 

0.30239 

282 

0.75264 

0.05448 

-0.48151 

288 

0.53747 

0.11100 

1.35141 

289 

1.05311 

0.13811 

0.95220 

291 

1.00488 

0.09706 

-0.85783 

292 

0.74250 

0.09378 

-0.59024 

293 

1.05359 

0.06176 

-0.95351 

294 

0.90762 

0.10346 

-1.11274 

310 

2.26877 

0.12737 

-0.13535 

311 

2.28012 

0.15090 

-0.19776 

312 

1.21856 

0.07243 

-0.89960 

314 

1.13782 

0.12077 

0.36537 

315 

0.87611 

0.08706 

0.53376 

316 

0.92071 

0.12754 

0.13563 

317 

1.47172 

0.13992 

-0.38777 

318 

1.48431 

0.11809 

-1.10771 

319 

1.75072 

0.14255 

-1.12547 

s.e. (b) 

0.02811 

0.05674 

0.07497 

0.05443 

0.11440 

0.30495 

0.13533 

0.05971 

0.12385 

0.15231 

0.15228 

0.13543 

0.09308 

0.10917 

0.13379 

0.05211 

0.27989 

0.05615 

0.19436 

0.12494 

0.06736 

0.05228 

0.03408 

0.02351 

0.03160 

0.12307 

0.05184 

0.06002 

0.28754 

0.15262 

0.11279 

0.12044 

0.07054 

0.15293 

0.03868 

0.04774 

0.07272 

0.08086 

0.07536 

0.07046 

0.06973 

0.12291 

0.14035 


0.00000 
0.31036 
0.19272 
0.21692 
0.19247 
0.11026 
0.21511 
0.17701 
0.16081 
0.13215 
0.10267 
0.20132 
0.15485 
0.21316 
0.11979 
0.10984 
0.18100 
0.12865 
0.14433 
0.22094 
0.13454 
0.00000 
0.00000 
0.00000 
0.00000 
0.52631 
0.15225 
0.17341 
0.35969 
0.30673 
,38101 
.45876 
.00000 
,?1196 
.26263 
.38496 
.20930 
0.23563 
0.14318 
0.22264 
0.1555c 
0.00000 
0.00000 


0. 
0. 

0. 
0. 
0. 
0. 


s.e. (c) 

0.00000 
0.02830 
0.02275 
0.02573 
0.04197 
0.02332 
0.02552 
0.02896 
0.01401 
0.04615 
0.01875 
0.05141 
0.03494 
0.05558 
0.03070 
0.02592 
0.02932 
0.03981 
0.02279 
.03C70 
.04031 
.00000 
.00000 
.00000 
.00000 
0.05551 
0.03035 
0.04352 
0.03905 
0.02370 
.05690 
.06141 
.00000 
.05619 
.02243 
.02543 
.04/33 
03112 
0.03156 
0.04453 
0.03578 
0.00000 
0.00000 


0. 
0. 
0. 
0. 
0. 
0. 


0. 

0. 
0. 
0. 
0. 
0. 
0. 
0. 




Table  B-3 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  13  Trend  Data 


Item 

320 

321 

322 

323 

324 

325 

342 

345 

346 

347 

348 

349 

350 

351 

353 

357 

358 

359 

362 

364 

367 

371 

375 

377 

380 

385 

38') 

404 

405 

406 

407 

408 

417 

418 

419 

433 

434 

444 

447 

448 

449 

450 

463 


a 

1.77939 

1.60588 

1.22556 

1.46627 

1.17863 

1.47065 

1.57489 

2.16828 

0.52442 

1. 226^3 

1.361/8 

0.94975 

0.49219 

1.42787 

1.42809 

1.25807 

1.14876 

1.28674 

1.14652 

0.66489 

1.37317 

1.41620 

1.18090 

0.75895 

1.78845 

1.17292 

1.55764 

0.53089 

1.65257 

1.81441 

1.47959 

1.36800 

1.59288 

1.16827 

1.27557 

1.42205 

1.23817 

1.75023 

1.32592 

1.56265 

1.00050 

1.22838 

1.17000 


s.e.(a) 

0.12727 

0.10477 

0.09085 

0.10435 

0.08565 

0.10204 

0.10997 

0.20660 

0.14425 

0.17010 

0.18338 

0.12726 

0.05045 

0.26556 

0.14141 

0.12921 

0.07895 

0.14502 

0.17208 

0.11928 

0.16554 

0.07708 

0.07538 

0.06892 

0.20040 

0.08648 

0.09352 

0.05754 

0.12072 

0.17436 

0.13391 

0.10171 

0.19539 

0.12611 

0.15006 

0.15314 

0.13747 

0.19138 

0.19449 

0.17695 

0.11600 

0.13124 

0.10417 


-0.72461 
-0.10657 
-0.77735 
-0.72216 
-0.27795 
0.11847 
-1.29250 
1.76516 
2.36894 
0.63777 
-1.75266 
1.30400 
0.08122 
2.22290 
-0.99913 
-1.20208 
-0.26995 
-1.55190 
-0.62576 
1.05190 
0.52966 
-0.64684 
0.18255 
0.2773.*^ 
-0.41334 
0.19274 
-0.62054 
-1.12245 
-0.63217 
-0.51630 
-0.09604 
-0.13247 
-1.96355 
-0.37348 
-1.49448 
-0.22194 
-0.24703 
-0.51518 
0.51705 
-0.82890 
-0.92096 
-0.64247 
-0.51312 


s.e. (b) 

0.08745 

0.03790 

0.08055 

0.07874 

0.04418 

0.03278 

0.12075 

0.25481 

0.64336 

0.12459 

0.29022 

0.19120 

0.05370 

0.48163 

0.13381 

0.15663 

0.05181 

0.21743 

0.14206 

0.19680 

0.10041 

0.05370 

0.04193 

0.05770 

0.09078 

0.04600 

0.05889 

0.14254 

0.07401 

0.08919 

0.05581 

0.04730 

0.31300 

0.080/9 

0.21595 

0.07104 

0.07427 

0.09747 

0.12671 

0.13435 

0.13551 

0.10120 

0.07779 


0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.20590 

0.11550 

0.13807 

0.21836 

0.21668 

0.15938 

0.14889 

0.10784 

0.19336 

0.19510 

0.26647 

0.22450 

0.45184 

0.13931 

0.17940 

0.11312 

0.18640 

0.17556 

0.31159 

0.09796 

0.18382 

0.21612 

0.18340 

0.18921 

0.15172 

0.16115 

0.21223 

0.22413 

0.20^96 

0.19747 

0.204j0 

0.21248 

0.26183 

0.22408 

0.20928 

0.19097 

0.26692 


0. 
0. 
0. 
0. 
0. 
0. 
0. 


s.e. (c) 

0.00000 
0.00000 
0.00000 
0.00000 
0.00000 
.00000 
.04987 
.00644 
.03381 
.03144 
.05770 
.02104 
0.04731 
0.00734 
0.04723 
0.05099 
0.03850 
0.05816 
0.06434 
0.04052 
0.02892 
0.03153 
0.02509 
0.03810 
0.04444 
0.02650 
0.03561 
0.05709 
.04005 
.04286 
.03400 
.03350 
0.05709 
0.05091 
0.05513 
0.04458 
0.04824 
.04708 
.03756 
.05419 
.05465 
0.04887 
0.05162 


0. 
0. 
0. 
0. 


0. 

0. 
0. 
0. 




Table  B-3 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  13  Trend  Data 


Item 

a 

4o4 

l.OOJJO 

40J 

J . 10/75 

1 . 14oU4 

LCI 

U. 82003 

«fOO 

i.ZoyJO 

1 .OooOl 

4^0 

1 . JoO/Z 

1  oR^m 
1 .03/0/ 

Oj4 

1    D 1  '5  /•  7 

1 .4oooo 

OJO 

0*  ol jjI 

bit 

1     A  A  £  O  O 

OJO 

0. 65209 

oJV 

O.yOOol 

o4U 

1.23404 

1 

1 • 1210/ 

642 

1.44388 

643 

1.82313 

644 

1.34890 

645 

2.03424 

646 

2.59135 

647 

1.10231 

648 

0.31726 

649 

0.67975 

650 

1.32778 

651 

0.60083 

652 

1.07488 

654 

1.09902 

655 

1.57269 

s.e. (a) 

0.15700 
0.09800 
0.10265 
0.07979 
0.09393 
0.12819 
0.15302 
0.18919 
0.20624 
0.17948 
0.12897 
0.34546 
0.21529 
0.10987 
0.13452 
0.11965 
0.23487 
0.20953 
0.20456 
0.23389 
0.29699 
0.16135 
0.05983 
0.21187 
0.16397 
0.08678 
0.15769 
12947 
0.20306 


0. 
-0. 

0. 

0. 
-0. 
-0. 
-0. 


.37014 
.04452 
.00016 
.01609 
.03405 
, 10404 
,71118 
-0.39627 
-0.08198 
0.08281 
-2.98323 
1.81572 
2.61415 
-0.43724 
-0.48771 
-0.51220 
0.65943 
-0.94115 
-1.77797 
-0.93154 
-0.35925 
0.(»6980 
-1.32120 
2.20758 
0.05762 
-0.28677 
0.73201 
-0.21836 
-1.44357 


s.e. (b) 

0.07753 
0.05580 
0.05624 
0.05747 
0.04376 
0.07027 
0.11365 
0.08170 
0.07064 
0.07649 
0.65580 
0.53670 
0.87036 
0.09253 
0.08955 
0.08751 
0.16725 
0.16121 
0.32609 
0.17113 
0.09686 
0.09109 
0.27390 
0.69635 
0.07431 
0.08693 
0.13547 
0.07480 
0.24/32 


0.25551 
0.20504 
0.20855 
0.19535 
0.14227 
0.20781 
0.21254 
0.18230 
0.23088 
0.23556 
0.22377 
0.20173 
0.23101 
0.22134 
0.20829 
0.18216 
0.28037 
0.20699 
0.21545 
0.19353 
0.24035 
0.30094 
0.14976 
0.24745 
0.23338 
0.21222 
0.14448 
0.21328 
0.21109 


s.e.(c) 

0.02538 
0.03989 
0.03888 
0.04345 
0.02976 
0.04788 
0.05295 
0.04204 
0.04016 
0.04182 
0.06017 
0.01776 
0.02888 
0.05567 
0.05065 
.04726 
.03470 
.05223 
.05766 
0.04953 
0.04191 
0.05370 
0.05223 
0.03272 
0.04415 
0.05546 
0.03389 
0.04952 
0.05599 


0. 
0. 
0. 
0. 




Table  B-4 


Item  Parameter  Estimates  and  Standard  Errors 
Age  17  Trend  Data 


Item 

a 

s.e. (a) 



5 


0.59432 

0.10084 

6 

0.87132 

0.06936 

10 

1.18014 

0.08729 

11 

0.64668 

0.07597 

12 

1.98049 

0.12169 

13 

1.02203 

0.07951 

14 

0.68562 

0.06332 

16 

0.97125 

0.06572 

17 

1.28363 

0.08264 

19 

0.53958 

0.03925 

20 

1.63945 

0.25431 

21 

1.36369 

0.19198 

22 

0.74009 

0.09171 

48 

0.87526 

0.10303 

49 

1.29795 

0.14094 

50 

1.05280 

0.1:257 

52 

0.81997 

0.10518 

53 

1.62017 

0.18426 

54 

0.82743 

0.09986 

57 

0.84634 

0.11760 

58 

0.99274 

0.22421 

59 

0.55757 

0.17453 

62 

1.53129 

0.16289 

63 

2.35399 

0.25031 

64 

3.22380 

0.42294 

65 

0.88214 

0.06660 

67 

1.10620 

0.10584 

69 

1.02922 

0.09581 

70 

0.44947 

0.04969 

71 

0.65227 

0.05710 

72 

0.97057 

0.07134 

73 

1.16801 

0.09325 

74 

0.75477 

0.06220 

75 

1.30843 

0.09016 

94 

0.81195 

0.04418 

95 

0.46229 

0.05794 

96 

0.84699 

0.10985 

107 

0.88547 

0.05288 

108 

0.67852 

0.08229 

109 

1.41434 

0.17188 

110 

1.17945 

0.17302 

113 

0.95333 

0.10833 

114 

1.22417 

0.11490 

115 

0.84714 

0.09042 

s.e.(b) 


0.24954 
-0.40147 
0.30022 
0.10939 
1.04973 
0.99909 
1.55313 
-1.59551 
-0.34538 
-0.76418 
-1.52437 
-1.51515 
-1.24172 
-0.87556 
-0.12048 
-0.42377 
0.48152 
0.45846 
0.43225 
1.17963 
2.28873 
2.65016 
1.06191 
0.8132/ 
0.88946 
0.00287 
-1.83938 
-1.22016 
O.54Q68 
0.20389 
0.41072 
-0.62411 
-0.08259 
0.05606 
-1.40886 
-1.13349 
-1.82007 
-0.65315 
-0.18683 
0.97126 
1.08039 
0.50002 
-0.34576 
-1.07396 


0. 
0. 
0. 
0. 
0. 
0. 


0. 
0. 
0. 
0. 


,11481 
.09473 
.05945 
10740 
.07714 
.07758 
0.11583 
0.16424 
0.07810 
0.11087 
0.46578 
0.37365 
0.24129 
0.19122 
,11148 
,13279 
.08869 
.08719 
0.08230 
0.13186 
0.46741 
0.68789 
0.13498 
0.13001 
0.21025 
0.06824 
0.26790 
0.19053 
0.03421 
0.04841 
0.04013 
11847 
.06687 
.05984 
,11973 
.26065 
.33210 
.08583 
11775 
,12507 
15213 
0.07917 
0.11349 
0.19467 


0. 
0. 
0. 
0. 
0. 
0. 
0. 
0. 
0. 
0. 


0.25427 
0.20221 
0.27804 
0.37757 
0.10755 
0.24024 
0.12626 
0.24518 
0.31034 
0.23001 
0.24948 
0.25025 
0.25336 
0.25945 
0.26735 
0.23556 
0.23286 
0.24928 
0.21730 
0.15362 
0.15108 
0.182j7 
0.49287 
0.47558 
0.59051 
0.23844 
0.17537 
0.00000 
0.00000 
0.00000 
0.00000 
0.00000 
.00000 
.00000 
.23908 
.21025 
.19377 
,14317 
0.23676 
0.21527 
0.22717 
0.19986 
0.14786 
0.19175 


0. 
0. 
0. 
0. 
0. 
0. 


s.e.(c) 

0.05811 
0.04389 
0.03177 
0.05674 
0.01163 
0.02364 
0.02287 
0.05540 
0.03918 
0.05189 
0.05780 
0.05845 
0.05972 
0.05904 
0.05014 
0.05228 
0.04640 
0.03478 
0.04391 
0.03268 
0.02084 
0.03061 
0.01982 
0.02091 
0.01772 
0.03966 
0.05049 
0.00000 
0.00000 
0.00000 
0.00000 
0.00000 
.00000 
.00000 
.05287 
.05941 
.05585 
.03849 
.05346 
0.02901 
0.03149 
0.03966 
0.04077 
0.05418 


0. 
0. 
0. 
0. 
0. 
0. 
0. 




Table  B-4 
(continued) 

Item  Parameter  Estimates  and  Standard  Errors 
Age  17  Trend  Data 


Item 

124 

125 

126 

133 

134 

135 

136 

137 

138 

139 

140 

141 

142 

143 

144 

146 

147 

148 

151 

152 

153 

156 

162 

163 

164 

166 

167 

168 

169 

170 

171 

174 

176 

177 

180 

181 

182 

183 

184 

185 

186 

187 

188 


0.69228 

1.35283 

1.14087 

1.66698 

0.36123 

1.35691 

0.98272 

1.57677 

1.14654 

1.04763 

0.79883 

0.88706 

0.45010 

0.98092 

0.74808 

0.61987 

1.57044 

1.45125 

1.33127 

1.12274 

1.96260 

0.46243 

0.60844 

1.16963 

0.88216 

1.02472 

1.10947 

1.11519 

0.74494 

0.22835 

0.85246 

0.93405 

1.09899 

1.79719 

0.83532 

0.69930 

1.51833 

1.51280 

2.73854 

2.46922 

1.18504 

1.26249 

1.05025 


s.e. (a) 

0.12098 

0.14904 

0.07039 

0.12274 

0.04104 

0.11314 

0.08258 

0.09791 

0.13888 

0.11700 

0.09200 

0.05596 

0.09018 

0.06374 

0.08066 

0.07859 

0.21814 

0.16767 

0.13468 

0.12332 

0.13277 

0.14180 

0.07624 

0.12727 

0.10423 

0.10894 

0.11192 

0.10319 

0.06133 

0.03620 

0.10091 

0.06458 

0.15720 

0.25008 

0.09349 

0.06199 

0.10459 

0.10451 

0.15404 

0.14869 

0.10607 

0.10020 

0.08945 


0.50749 
-0.95722 
-0.23730 
-0.17206 
-2.48841 
0.51104 
-0.44557 
0.16802 
-1.31181 
-0.78478 
-0.05900 
-0.92831 
1.77775 
0.03659 
0.41074 
1.52425 
0.11486 
-0.63203 
0.27173 
-1.04790 
1.71999 
4.09359 
-0.41443 
-0.07951 
-1.29489 
-0.54436 
-0.48091 
-0.15072 
0.85600 
-1.45780 
1.08234 
0.71370 
1.67135 
1.77935 
0.19459 
1.46190 
-1.17939 
-1.22430 
-0.83845 
-0.87849 
-1.20315 
0.03545 
-0.47839 


s.e.(b) 

0.11257 

0.21429 

0.06786 

0.08357 

0.35356 

0.06978 

0.11038 

0.05176 

0.26686 

0.17508 

0.10488 

0.10928 

0.27284 

0.05851 

0.09176 

0.15434 

0.12772 

0.18246 

0.07635 

0.20798 

0.14176 

1.11079 

0.14489 

0.10308 

0.24418 

0.14116 

0.13101 

0.09229 

0.06342 

0.32761 

0.11147 

0.05491 

0.21015 

0.29740 

0.08482 

0.10658 

0.17023 

0.17530 

0.18271 

0.18119 

0.18815 

0.d7130 

0.10922 


0.30093 

0.16147 

0.19640 

0.18573 

0.27880 

0.41078 

0.42686 

0.21842 

0.23303 

0.21912 

0.23301 

0.25088 

0.23565 

0.19202 

0.49653 

0.24636 

0.49320 

0.25910 

0.17931 

0.19010 

0.06830 

0.28630 

0.25092 

0.21424 

0.26669 

0.19722 

0.19344 

0.14279 

0.13239 

0.20260 

0.24024 

0.22718 

0.10532 

0.20325 

0.18657 

0.21872 

0.00000 

0.00000 

0.00000 

0.00000 

0.25078 

0.24953 

0.24689 


s.e. (c) 

0.05240 

0.04570 

0.03548 

0.03129 

0.06533 

0.02682 

0.04811 

0.02553 

0.05472 

0.05571 

0.05174 

0.05185 

0.03793 

0.03446 

0.04012 

0.03045 

0.04714 

0.05495 

0.03757 

0.05307 

0.00657 

0.02185 

0.05722 

0.04755 

0.06140 

0.05151 

0.04916 

0.03897 

0.03076 

0.05773 

0.03287 

0.02588 

0.01952 

0.01663 

0.04754 

0.02270 

0.00000 

0.00000 

0.00000 

0.00000 

0.05289 

0.03616 

0.04866 




Table  B-4 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  17  Trend  Data 


Item 

191 

193 

194 

195 

196 

197 

200 

201 

202 

203 

204 

205 

206 

210 

212 

213 

216 

217 

236 

237 

238 

239 

240 

241 

242 

254 

255 

256 

257 

258 

259 

281 

282 

288 

289 

293 

294 

310 

311 

312 

314 

315 

316 


a 

0.45199 

1.04098 

0.64105 

0.53332 

1.77071 

1.29400 

1.90838 

1.21404 

0.96720 

0.63631 

0.47799 

0.71847 

0.85364 

0.62594 

1.11070 

0.91789 

0.49083 


1. 
0. 
1. 
1, 
0. 
0. 
0. 
0. 
0. 
0. 
1. 


.26543 
.69610 
.06995 
, 10482 
.93467 
.80081 
. 59960 
.50169 
.90757 
.92322 
.70086 
2.08256 
1.76546 
1.85470 


,17509 
57491 
66453 
69711 
,00109 
55453 
97597 
1.83938 
1.06266 
0.83053 
0.99330 
0.46904 


1. 
0. 
0. 
0. 
1. 
1. 
1, 


s.e. (a) 

0.06813 

0.11131 

0.12180 

0.08243 

0.14280 

0.06383 

0.17310 

0.08423 

0.12330 

0.08083 

0.06931 

0.07537 

0.09833 

0.08744 

0.11419 

0.10593 

0.06256 

0.16442 

0.04781 

0.06170 

0.05601 

0.05132 

0.06984 

0.06427 

0.05357 

0.05137 

0.09005 

0.12609 

0.17280 

0.11580 

0.12134 

0.09228 

0.04444 

0.08966 

0.07315 

0.09872 

0.24024 

0.10814 

0.10728 

0.06358 

0.06001 

0.07254 

0.07085 


-1.58538 
-0.43962 
1.82739 
0.43129 
0.69601 
1.10911 
0.60''25 
-1.31791 
-1.62836 
0.93656 
-0.41630 
1.09384 
-0.22248 
0.95847 
0.03431 
0.67046 
-0.21556 
1.18444 
-1.19105 
-0.60850 
0.22346 
-0.35216 
-0.70598 
-1.36050 
-0.70475 
-1.33950 
-1.49053 
-0.39853 
-0.55263 
0.13425 
0.19678 
0.42697 
-0.67859 
1.34667 
0.81770 
-2.33080 
-2.09024 
-0.20875 
-0.41236 
-1.07107 
0.11310 
0.63841 
0.21633 


0. 
0. 
0. 
0. 
0. 


s.e. (b) 

0.33402 
0.13478 
0.28054 
0.09750 
0.06980 
0.05357 
0.07112 
0.15585 
31339 
09236 
16504 
09238 
11972 
0.10476 
0. 08^67 
0.07897 
0.12128 
0.15028 
0.12801 
0.08244 
0.03486 
0.06172 
0.12303 
0.21402 
0.14197 
0.11965 
0.22328 
11646 
17302 
0.06243 
0.05989 
0.06140 
0.10987 
0.15538 
0.08427 
0.32026 
0.57388 
0.06681 
0.08265 
11463 
05968 
05335 
10438 


0. 
0. 


0.26496 

0.24884 

0.16678 

0.26520 

0.22798 

0.14971 

0.11621 

0.22761 

0.25683 

0.13972 

0.26466 

0.12780 

0.26638 

0.16588 

0.18187 

0.14137 

0.18109 

0.17113 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.29149 

0.21371 

0.38141 

0.26561 

0.00000 

0.25189 

0.22230 

0.23967 

0.22434 

0.20215 

0.18550 

0.25614 


s.e. (c) 

0.06224 
0.05275 
0.03701 
0.05620 
0.02208 
0.01153 
0.02207 
0.04843 
0.06016 
0.03724 
0.06129 
0.03100 
0.05256 
0.04229 
0.04405 
0.03576 
0.05149 
0.02743 
0.00000 
0.00000 
00000 
00000 
00000 
00000 
,00000 
00000 
00000 
0.00000 
0.00000 
0.00000 
0.00000 
0.02980 
0.04855 
03358 
03784 
00000 
05931 
02815 
03295 
0.04848 
0.03653 
0.02636 
0.05750 


0. 
0. 
0. 
0. 
0. 
0. 
0. 


0. 
0. 
0. 
0. 
0. 
0. 




Table  B-4 
(continued) 


Item  Parameter  Estimates  and  Standard  Errors 
Age  17  Trend  Data 


Item 

317 

318 

319 

320 

321 

322 

323 

324 

325 

339 

345 

346 

347 

349 

350 

351 

354 

358 

360 

363 

364 

367 

375 

377 

381 

385 

386 

390 

391 

392 

393 

433 

434 

444 

447 

463 

464 

465 

466 

467 

468 

495 

496 


0.99313 

1.08016 

1.11911 

1.22579 

1.14684 

1.35538 

1.33168 

1.06698 

1.15749 

0.62174 

1.48303 

0.80011 

0.8581V 

1.16557 

0.36541 

1.27585 

0.69296 

1.01825 

0.95496 

1.72319 

0.72257 

0.61159 

1.06069 

0.48717 

1.12524 

1.22842 

1.11436 

1.96132 

1.97781 

2.09415 

1.96949 

0.99114 

1.22338 

1.32800 

1.08191 

1.07046 

1.53342 

2.16259 

1.20625 

0.72010 

1.35830 

0.94583 

1.39056 


s.e. (a) 

0.12426 

0.10370 

0.12121 

0.09701 

0.07858 

0.12050 

0.12319 

0.08468 

0.07953 

0.07576 

0.10966 

0.12713 

0.11307 

0.08068 

0.03532 

0.08354 

0.10069 

0.06307 

0.15098 

0.16888 

0.05923 

0.04925 

0.06546 

0.03950 

0.12230 

0.09807 

0.08874 

0.'.7034 

0.21314 

0.26210 

0.33926 

0.11028 

0.^*5914 

0.16288 

0.12525 

0.10094 

0.15036 

0.20103 

0.08530 

0.05933 

0.08374 

0.10372 

0.15146 


0.05050 
-1.46409 
-1.66604 
-0.67678 
-0.12357 
-1.03931 
-1.09411 
-0.641C8 
-0.00904 
-1.51617 
1.47069 
1.43191 
0.00737 
1.05905 
0.59877 
1.60148 
0.75584 
-0.23213 
-2.17454 
-0.02255 
0.77986 
0.30881 
0.13339 
-0.23439 
-0.13496 
0.18844 
-0.45880 
0.83773 
0.66570 
0.55245 
1.45088 
-0.33099 
-0.24237 
-0.92151 
-0.45185 
-0.50005 
0.84850 
0.46049 
0.06940 
0.28322 
0.09397 
-0.14280 
-0.51094 


s.e. (b) 

0.10744 

0.23019 

0.28882 

0.12927 

0.06981 

0.18702 

0.19858 

0.11602 

0.06058 

0.26507 

0.10829 

0.18972 

0.11694 

0.07224 

0.05512 

0.10023 

0.10257 

0.06843 

0.47425 

0.09748 

0.06076 

0.05634 

0.05284 

0.08782 

0.10626 

0.06306 

0.10243 

0.16045 

0.09249 

0.12213 

0.33080 

0.12683 

0.12320 

0.22347 

0.14587 

0.12829 

0.10722 

0.07854 

0.06193 

0.06108 

0.05236 

0.10995 

0.15331 


0.28210 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.18479 

0.07820 

0.18799 

0.30546 

0.17184 

0.18704 

0.06598 

0.21238 

0.25487 

0.26545 

0.19835 

0.14067 

0.17399 

0.22482 

0.22231 

0.21654 

0.21048 

0.23585 

0.48969 

0.25281 

0.55618 

0.50612 

0.25532 

0.26516 

0.25819 

0.28064 

0.39462 

0.42746 

0.45312 

0.19592 

0.18237 

0.12755 

0.24283 

0.23921 


s.e. (c) 

0.05110 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.00000 

0.05327 

0.01010 

0.03564 

0.05666 

0.01921 

0.03891 

0.00967 

0.04809 

0.03836 

0.06258 

0.03719 

0.03097 

0.03693 

0.03233 

0.04993 

0.04634 

0.03479 

0.04671 

0.02547 

0.02494 

0.02999 

0.01939 

0.05669 

0.05549 

0.05934 

0.06060 

0.05974 

0.02443 

0.02449 

0.03371 

0.03798 

0.02514 

0.05509 

0.05392 




Table  B-4 
(continued) 

Item  Parameter  Estimates  and  Standard  Errors 
Age  17  Trend  Data 


Item         a   s.e. (a)          b   s.e. (b)          c   s.e. (c)

497    1.09546    0.12542   -0.67282    0.16884    0.25740    0.05886
516    3.31719    0.29666   -0.32297    0.20806    0.14302    0.02464
517    1.46078    0.13492   -0.25521    0.10934    0.18331    0.03473
518    1.99442    0.18488    0.08683    0.08734    0.13598    0.02394
634    0.85589    0.10233   -0.65988    0.16776    0.10233   -0.65988
635    0.95778    0.10611   -0.14893    0.11167    0.25762    0.05682
636    0.44915    0.10409   -3.69624    0.99511    0.27166    0.06395
637    1.42735    0.17687    1.11024    0.15315    0.22818    0.03079
638    0.83278    0.18982    1.89726    0.37568    0.2862?    0.03509
639    1.06417    0.12958   -0.19962    0.12797    0.30821    0.06005
640    1.04213    0.11522   -0.57695    0.15061    0.24451   0.051)78
641    1.12321    0.13470   -0.84345    0.19809   0.2'.560    0.06046
642    1.01585    0.11815    0.55728    0.09316    0.23903    0.04710
643    1.45430    0.20012   -1.12326    0.29t^6    0.26229    0.06082
644    1.70941    0.32433   -1.58179    0.58078    0.25611    0.06041
645    1.38063    0.17732   -1.04481    0.25864    0.24647    0.05768
646    2.22505    0.26135   -0.33018    0.18219    0.26073    0.05078
647    1.23280    0.14713    0.22322    0.10538    0.33978    0.05467
648    0.23808    0.04790   -1.52518    0.42730    0.20636    0.05911
649    0.60196    0.13288    1.65869    0.29578    0.28403    0.04757
650    1.35563    0.14548    0.02791    0.10287    0.27185    0.05205
651    1.02332    0.11280   -0.09762    0.10755    0.26029    0.05557
652    1.20426    0.12332    0.55375    0.07724    0.17273    0.03877
653    1.41871    0.15956    0.87244    0.10998    0.21576    0.03286
654    1.11849    0.12182   -0.36268    0.13094    0.26608    0.05827
655    1.31916    0.?1214   -1.59599    0.42450    0.26037    0.06143
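
The a, b, and c columns of Table B-4 read naturally as the discrimination, difficulty, and lower-asymptote parameters of a three-parameter logistic (3PL) item response model. As a minimal sketch under that assumption, the Python function below evaluates the 3PL response probability for item 635 using the estimates listed in the table. The function name `p_correct_3pl` and the scaling constant D = 1.7 are illustrative assumptions, not taken from this appendix.

```python
import math


def p_correct_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct response under a 3PL logistic model.

    theta: examinee proficiency
    a, b, c: discrimination, difficulty, lower asymptote
    D: scaling constant (1.7 is assumed here, not stated in the table)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Estimates for item 635 as listed in Table B-4 (Age 17 trend data).
item_635 = {"a": 0.95778, "b": -0.14893, "c": 0.25762}

# At theta == b the logistic term equals 1/2.
p_at_b = p_correct_3pl(item_635["b"], **item_635)
print(round(p_at_b, 5))
```

At a proficiency equal to b, the logistic term is one half, so the probability is c + (1 - c)/2, roughly 0.629 for this item.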




Table  B-5 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  9 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5   6   7   9 |  1   2   3 |  1   2   3   4  10  11 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                        |                | USED

4  7O9900e-«00e/OO2  N001602  1 

• 

• 

• 

• 

• 

14 

• 

• 

• 

13 

•  1 

• 

13 

•         •  • 

•  • 

• 

1  4 

7  7W9O0W0OI/OOI  1 

7 

1  \ 

8  7W90(»-fl00e/OO2  1 

7 

!  1 

» 7W9005-«003/003  1 

7 

1  \ 

10  70nOO(-A001/OOt  1 

M 

1  ? 

11  7099006-A002/002  1 

1  3 

IS  7onooa-«ooi/ooi  i 

• 

7 

1  \ 

la  7OT90ll-«Ol/001  1 

1  1 

fiO  7(»90e*-«001/001  1 

3 

• 

• 

3 

2    •     •  • 

1  3 

tt  7101007-MOl/OOl  1 

10 

1  1 

£7  7101009-MOl/OOl  1 

? 

1  ? 

7&  7101085H»01/001  NOIMlOl  1 

22 

19 

22 

19 

18    •     •  • 

1  4 

77  71O1OS&-A0O1/OO1  N014001  1 

• 

\P 

8 

• 

9 

• 

• 

•  14 

• 

• 

1  4 

71  7101067H)001/001  1 

1  \ 

79  7101051-flOOl/OOl  1 

1  \ 

80  710109HM01/001  N009101  1 

19 

• 

1  4 

81  7101080-0001/001  1 

•  1 

82  710l081-flOO!/001  1 

10 

I  \ 

83  7101062-flOOl/OOl  1 

u 

1  \ 

8«  71O10U-O001/OO1  1 

5 

>  } 

85  7101064-<»01/001  1 

1  1 

88  7102001-AOOl/OOl  1 

• 

• 

• 

• 

• 

19 

• 

• 

!9 

1  ? 

87  7102004-A001/001  1 

IS 

• 

• 

• 

• 

• 

1? 

• 

• 

13 

1  } 

88  7102005-MOl/OOl  1 

1 

1  ^ 

89  710eo06-fl001/001  1 

•  » 

90  7102007-flOOl/OOl  1 

1  \ 

91  7102008-fl001/00l  1 

>  } 

92  7102010-flOOl/C^l  1 

• 

• 

• 

• 

5 

• 

• 

• 

S 

1  ? 

93  710eOU-i)001/001  1 

i? 

11    •     •  • 

1  ? 

94  7102013-MOl/OOl  1 

1  20 

17 

17 

?o 

17 

16         •  • 

1  2 

97  7102029-0001/001  1 

2 

•  ? 

98  710203(HM01/001  1 

• 

n 

•  J_ 

99  7102031-flOOl/OOl  1 

101  7102032-A002/002  1 

102  7102032-A003/003  1 

1  ? 

103  7102032-M04/004  1 

I  ? 

104  7102032-A005/005  1 

I  ? 

108  7102034-ftOOl/OOl  1 

• 

1  1 

107  71O2035-A0Ol/<y)l  1 

1  } 

108  710203&-A001/O0t  1 

|4 

1  1 

112  710300e-A001/001  1 

• 

I 

• 

• 

1  n 

113  7103C04-A001/001  1 

• 

'_L- 

lis  7103017-MOl/OOl  1 

to 

1  \ 

120  7103026-A001/G01  1 

t  t 

121  7103027-AOOl/OOl  1 

1  \ 

1^2  710302»W»01/00I  1 

1  1 

123  7103029-flOOl/OOl  1 

I  \ 

124  7103030^)001/001  1 

1  1 

•27  7103033-flOOl/OOl  1 

I  \ 



Table  B-5 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  9 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5   6   7   9 |  1   2   3 |  1   2   3   4  10  11 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                        |                | USED

i2E  7103034-AOOl/OOl  L 

• 

6  1 

•  1  • 

•        •         •  • 

1   •    •     •  -11 

130  7103036-flOOl/OOl  L 

2 

4  1  • 

3    •     •  • 

1   •    •     •  -13 

149  7103047-AOOl/OOl  1 

IS 

'  1  ' 

*         *  * 

1   •    •     •  "11 

150  7103048-0001/001  L 

12 

9  1  • 

8    •     •  • 

1   •     •     •  '13 

157  7lO3053-fl0Ol/0Ol  L 

13 

•  1  • 

1   •     •     •  '11 

159  71O309S-A0Oi/OOl  1 

•  1  • 

• 

1   •     •     •  '11 

160  71O30SH>001/0Ol  L 

•  1  • 

*  * 

1   •     •     *  '11 

m  7103057-flOOl/OOl  L 

IS 

•  1  • 

*  * 

1    •     •     •  '11 

m  7103062-«)01/001  L 

12 

•  1  • 

*         *  * 

1    •     •     •  "11 

167  7103062-fl002/002  [ 

12 

•  1  • 

*  * 

\   •     •     •  "11 

16fl  7l03062-fl003/003  [ 

12 

•  1  • 

•         •         *  * 

1   •     •     •  "11 

169  7127001-flOOl/OOl  NOOSaOl  1 

14  I 

16  1  - 

15    •     •  • 

1   •     •     •     i  1  4 

170  7127001-A002/O0a  1 

14  1 

16  1  - 

IS   •    •  • 

1   •     •     •  13 

171  7127001-fl003/003  1 

14  1 

16  1  - 

15    •     •  • 

1   •     •     •  '13 

174  7127003-0001/001  NOOglOl  1 

11 

IS 

•  1  IS 

1   •          •     5  14 

175  7127003-0002/002  1 

11 

IS 

-  1  IS 

1    •     •     •  "13 

178  7127005-flOOl/OOl  ! 

6 

•  1  • 

1    •     •     •  "11 

189  7201002-flOOl/OOl  1 

1 

•  1  • 

1    •     •     •  '11 

iqa  7POiM3-aooi/ooi  I 

S 

•  1  • 

1    •     •     •  "11 

193  7201023-0001/001  1    14        '         '         '  !  !  !  LLJ  Li—;  '.  ;  ;  ;  ;  ;  .  j 

196  7202003-0001/001  1   '     '     '     '    7  ;  ! — LLJ  '. — U— | — '.  ; — ;  ;  ^-j— ; — :  : — .  '  ' 

14*7903002-0001/001              1    '     '     3     '     '     '     '         1   !  U  !  !  !  !— ^ 

104  7PO30a3-a001/001  1 

8 

•  1  • 

1    •     •     •  '11 

MA  7203006-0001 /OOl              1    "     '     '     '14     '  '  U  !  !  U  '.  '.  !  

205  7203012-0001/001 

7 

6  1  - 

s  •   •  • 

1   •     •     •  "13 

207  7203043-0001/001  

3 

•  1  • 

1    •     •     •  "11 

208  7203044-0001/001 

13 

-  1  6 

1    •     •     •  "13 

214  7203060-0001/001  

9 

•  1  • 

1   •     •     •  "11 

91ft  7903059-0001/001 

11 

•  1  • 

1    •     •     •  "11 

9iq  7997001-0001/001              1    '     '     7     '     '     '     '  U  !  !  L!  !  '.  '.  '■  !  ! — - — 

aO  7301002-0001/001  

3 

•  1  • 

1   •     •     •  "11 

221  7301004-0001/001  N009601 

•  1  3 

18-     •  -14 

999  7101006-0001/001 

7 

•  1  • 

1    •     •     •  "11 

993  7301007-0001/001              1    '     '     '     '     8     '  '_        1   !  L!  !  !  !  !— ^ 

224  7301011-0001/001  

16 

11 

•  1  • 

11 

1    •     •     •  '13 

225  7301012-0001/001  

13 

14  1  • 

13    •     •  • 

1    •     •     •  '13 

243  7301071-0001/001 

18 

1  7 

■  1  7 

1    •     •     •  '13 

244  7301071-0002/002  

18 

1  7 

•  1  7 

1   •     •     •  -13 

245  7301071-0003/003  

18 

1  7 

•  1  7 

1    •     •     •  -13 

246  7302001-0001/001 

1  8 

1  1  • 

1   •     •     •  -13 

247  73oeooe-oooi/oog  

1  17 

•  1  • 

1   •     •     •  "11 

248  7302002-0003/004  

1  17 

•  1  • 

1   •     •     •  "11 

249  7302002-0005/006  

1  17 

•  1  • 

1    •     •     •  "11 

250  7302002-0007/008  

1  17 

•  1  • 

1   •     •     •  -11 

SV*  7.3<tf004-0002/002 

16 

•  1  • 

1    •     •     •  "11 

953  730^005-0001/001          1  '    '    9    '    '   !  '  1   !  !_!_! — '.  !  ^ 

263  7303004-flOOl/OOl 

10 

•  1  • 

1   •     •     •  "11 

^79  7303013-0001/001  

8  1  - 

7    •     •  • 

1   •    •     •  -13 

2^  7401001-0001/001  

1  1 

1  5 

•  1  S 

1    •     •     •  "13 

337  7401003-flOOl/OOl 

8 

•  1  • 

1    •     •     •  -11 

Table  B-5 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  9 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5   6   7   9 |  1   2   3 |  1   2   3   4  10  11 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                        |                | USED

338  7m005H>001/001 


I 


I 


I 


i  1 


339  7401007-AOOl/OOl 


340  74010i(H1001/001 


I 


I 


I 


341  7401011-0001/001 


342  7401016-0001/001 


344  74010ea-fl001/001 


17 


347  7401032-0001/001 


13  I 


348  7401066-0001/001 


349  7401067-0001/001  N003001 


I  10 


I  10 


14 


i  1 


i  1 


i  3 


I  3 


I  1 


i  1 


i  1 


i  4 


350  7401067-0002/002  N00300g 


I  10 


i  10 


IS 


I  4 


351  7401067-0003/003  N003003 


16 


I  2 


352  7401068-0001/001 


353  7401069-0001/001 


11  i 


10 


355  7401071-0001/001 


4  i 


14 


356  7401072-0001/001  N013301 

357  7401073-0001/001  


I  17 


I  17 


14 


14 


I  14 


12 


362  7401078-0001/001 


I  9 


•  I  9 


363  74CiO79-O0Ol/OOl 


365  7401061-0001/001 


2  i 


I  1 


i  3 


14 


13 


366  7401082-0001/001 


370  7401086-0001/001 


15  i 


371  7402020-0001/001  N002401 


9  i 


13  i 


12 


i  4 


372  7402021-0001/001  N009g01 


•  14 


i  16 


I  16 


28 


I  4 


373  7402082-0001/00! 


374  7408023-0001/001 


I  2 


i  2 


i  3 


I  2 


375  7403007-0001/001  N004901 
377  7403019-0001/001  NOOgSdl 


12 


17 


378  7508012-^1/001 


379  7503001-0001/001 


14 


13 


385  7503044-0001/001 


7  I 


367  H801000-0001/00i 


16  I 


388  H801000-0002/00g 


10 


10 


I  4 


i  1 


i  1 


i  3 


i  1 


i  1 


i  1 


363  H201000-0003/003  N00flfi03 


10 


17 


I  2 


39S  H2050OO-O001/OO1 


399  H205000-fl002/002  N010502 


9  I 


9  I 


I  1 


i  2 


400  HS05000-fl003/003  N010503 

401  H20S00O-fl004/004  N010504 


9  i 


9  i 


i  2 


I  2 


402  Hg06000-0001/001  I«01i301 


21 


I  2 


403  H206000-0002/002  HOI  1302 


404  H22200O-O00i/001  N001601 


22 


405  H82a000-0002/002  N00160e 


10 


406  H222000-O003/OO3 


10 


407  H22200O-O0O4/0O4 


10 


408  H82200OWW05/0O5  N001fi04 


10 


417  H24100O-O001/OO1  II0O44O1 


10 


11 


12 


I  2 


i  2 


i  1 


I  1 
I  2 


418  H241000-fl002/00g 


9  I  2 


419  H24100O-O0O3/OO3 


433  H26500O-O001/0O1  N002001 


i  1 


i  1 


434  Hg65000-fl002/002  NOOgQQg 


435  H265000-0003/003 


10 
11 


I  2 


I  2 


I  1 




Table  B-5 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  9 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5   6   7   9 |  1   2   3 |  1   2   3   4  10  11 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                        |                | USED

438  Ha6800(H)001/001  N013201  1 

17 

439  tCfi9O0(H)002/O0e  NOIOIOS  1 

10 

• 

1  1 

440  K86900(HI003/003  N010103  1 

11 

1  1 

441  HSaZOOO-AMl/OOl  1 

442  iefl2ooo-Aooe/ooe  i 

443  ieB200(HKW3/003  1 

448  H2aG00(H»01/OOl  N004701  1 

15 

• 

1  1 

449  H2a60<NH)0O2/O0e  N00470e  1 

16 

,  1 

450  H28600(H1003/003  N0047Q3  1 

17 

1  1 

451  H287000-A001/001  1 

3  1 

1  1 

45^  HS87000-fl002/OOe  N013S02  1 

3  1 

g 

1  2 

469  .404000-MOi/OOl  M013101  1 

1 

1  2 

470  H4O400O-A002/0Q2  N013102  1 

2 

1  2 

4?1  H404000*fl003/003  N013103  1 

3 

1  2 

472  H404000*fl004/004  1 

473  H4O900O-A001/0O1  1 

1  1 

474  H405000-A002/O0e  1 

1  1 

475  H405000-fl003/003  N001S03  1 

3 

476  mOS00O-a0O4/OO4  1 

1  1 

500  H41fiO0(H)0Q3/0O3  N010003  1 

7 

,  1 

501  H417000-A001/001  1 

1  1 

502  H417000-A002/002  1 

1  1 

503  H4i7000-A003/003  1 

1  I 

504  H417000-A004/004  1 

1  1 

505  H4ia000-0001/001  1 

1  1 

506  H418O0O-A002/OOe  1 

1  { 

507  H418000-M03/003  1 

1  1 

508  H419000-M01/001  1 

12  1 

1  1 

509  H419000-M02/002  1 

12  1 

1  1 

510  H419000-fl003/003  1 

12  1 

1  1 

511  H41900O-A004/0O4  1 

12  1 

1  J 

512  H419000-fl005/006  1 

12  1 

1  J 

513  H422000-A00i/O01  1 

5  1 

1  1 

514  H422000-A002/002  1 

5  1 

515  H4e2000-fl003/003  1 

5  1 

1  1 

530  H463000-fl001/001 

1  1 

531  H463000-A002/002  1 

1  1 

532  H463000-M)03/003  1 

1  I 

526  H^68000-fl001/001  N010801  1 

•  • 

7  1 

• 

29 

• 

537  H468000-A00e/O02  1 

•  • 

• 

7  1 

• 

1  t 

636                     N005101  1 

1  t 

640                     N002003  1 

1  1 

643                     N004fl01  1 

19 

1  ) 

S47                     N001603  f 

10 

• 

1  I 

648                     N003a02  1 

3 

649                     N003B03  1 

4 

650                     N004201  1 

12 

651                      M04202  1 

13 

652                     N002102  1 

6 

654                      N004402  1 

10 



Table  B-5 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  9 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5   6   7   9 |  1   2   3 |  1   2   3   4  10  11 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                        |                | USED


655 

6S6 

N013104   1 1 

• 

•     •  1  1 

fiS7 

• 

•     '  1  1 

6SB 

NOlllOl   1 1 

'     •  1  1 

659 

NOlOSOl    1 1 

6S0 

661 

662 

N001501   1 1 

1  11 

663 

2*11 

664 

4*11 

665 

Noioaol  1  1 

20    -  1  1 

666 

NUMBER  OF  CALIBRATED            |                                |            |                        |                |
ITEMS  IN  BOOKLET                | 17  19  16  15  17  12  14  15 | 20  20  18 | 20  16  17  17  20  16 | 14  14  19  15 |

NUMBER  OF  CALIBRATED  ITEMS     |                                |            |                        |                |
LINKING  BOOKLET  ACROSS  YEARS   |  6   6   6   6   5   8   6   6 | 20  20  18 | 20  16  17   7   6   5 |  5   8  13   6 |

NUMBER  OF  CALIBRATED  ITEMS     |                                |            |                        |                |
LINKING  BOOKLET  WITHIN  YEARS   |  0   0   0   0   0   0   0   0 |  2   2   2 |  2   2   2   0   0   0 |  0   0   0   0 |



Table  B-6 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  13 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5  11  12  13 |  1   2   3 |  1   2   3  14 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                |                | USED

1  70990O1-M01/0O1  1 

• 

3  1 

17 

•  1 

17 

• 

1  3 

2  7099001-M02/003  1 

• 

3  1 

17 

•  1 

17 

•  • 

1  3 

4  ;099002-A002/002  14001802  1 

8 

7  1 

8  • 

•  21 

1  4 

5  7099003-MOl/OOl  1 

• 

•  1 

•  • 

1  1 

6  709900H»01/001  1 

• 

'  12 

12 

•  1 

13 

•  • 

1  3 

10  7099006-flOOl/OOl  1 

• 

7 

2  1 

2  • 

1  3 

11  7099006-fl002/00e  1 

• 

7 

2  1 

2  • 

1  3 

12  7099007-flOOl/OOl  1 

13 

5  1 

5  • 

1  3 

\n  7099007-A002/002  N005002  1 

13 

S  1 

5  • 

7 

1  4 

U  7099007HM)03/003  N005003  1 

13 

5  1 

5  • 

8 

1  4 

16  7099009-flOOl/OOl  N003601  1 

13 

14 

13 

1  3 

17  7099009-fl002/002  N003602  1 

13 

14 

1  14 

1  3 

19  7099012-flOOl/OOl  N003S01  1 

2 

2 

10 

1  3 

20  7099013-flOOl/OOl  1 

4 

1  1 

21  7099013-M02/002  1 

4 

1  1 

22  7099013-A003/003  1 

4 

1  1 

51  7099021-A001/001  1 

9 

1  1 

52  7099021-A002/002  1 

9 

1  1 

53  7099021-A003/003  1 

9 

1  1 

54  7099021-A004/004  1 

9 

55  7099022-A001/001  1 

•  13 

56  7099022-A002/002  1 

-  13 

57  7099023-A001/001  1 

6 

58  7099023-A002/002  1 

6 

59  7099023-A003/003  1 

6 

61  7099025-A001/001              11 1  j 

65  7099027-A001/001  1 

11 

66  7101007-A001/001  1 

9 

69  7101019-A001/002  1 

18 

70  7101019-A003/004  1 

18 

71  7101019-A005/006  1 

18 

72  7101019-A007/008  1 

18 

73  7101019-A009/010  1 

18 

74  7101019-A011/012  1 

18 

75  7101019-A013/014  1 

18 

76  7101055-A001/001  N004101  1 

1  20 

20 

17 

1  20 

20 

16  ' 

•  22 

1  3 

92  7102010-flOOl/OOl  1 

4 

4  • 

1  3 

94  7102013-flOOl/OOl  1 

1  18 

18 

15 

1  18 

18 

14  • 

1  3 

96  710£01S-fl001/001  1 

8 

98  7102030-flOOl/OOl  1 

1 

99  7102031-flOOl/OOl  1 

IS 

106  7102034-flOOl/OOl  1 

13 

111  7102038-flOOl/OOl  1 

•  4 

113  7103004-flOOl/OOl  1 

•  2 

lt6  7i03019-A001/001  1 

11 

1  13 

1  14 

1  3 

US  7103021-flOOl/OOl  1 

6 

1  11 

1  12 

1  3 

119  710302S-fl001/001  1 

1 

1  1 

1  1 

1  3 

121  7103027-flOOl/OOl  1 

5 

« 

1  1 

124  7103030-A001/001  1 

2 

1  1 



Table  B-6 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  13 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5  11  12  13 |  1   2   3 |  1   2   3  14 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                |                | USED

126  7103032-flOOl/OOl  1 

. 

1 

1  1 

131  7103037-AOOl/OOl  1 

4 

1  1 

132  7103038-AOOl/OOl  1 

. 

1  3 

133  7103039-AOOl/OOl  1 

■ 

7 

g 

1  2 

134  7103041WW01/001  1 

9 

3 

3 

1  3 

135  7103M1-A002/002  1 

9 

3 

3 

1  3 

136  7103M1-A003/003  1 

9 

3 

3 

1  3 

137  7103(M1-A004/004  1 

9 

3 

3 

1  3 

136  7lC3042-fl001/001  1 

4 

• 

• 

1  1 

139  7103(M2H»02/002  1 

4 

• 

• 

1  1 

140  7103043-flOOl/OOl  1 

9 

• 

• 

1  1 

141  7103044-AOOl/OOl  tW01701  1 

11 

10 

11 

1  4 

143  7103044-A003/003  1 

11 

10 

11 

1  3 

144  710304S-A001/001  N005201  1 

10 

11 

23 

1  3 

146  710304S-A003/003  N005203  1 

10 

11 

25 

1  3 

151  7103049-A001/001  1 

6 

152  7103049-A002/002  1 

6 

154  7103051-A001/001  1 

5 

155  7103051-A002/002  1 

5 

156  7103052-A001/001  1 

7 

161  7103057-A001/001             18                                                                       |....|  , 

166  7103062-A001/001  1 

14 

167  7103062-A002/002  1 

14  1 

168  7103062-A003/003  1 

U 

169  7127001-A001/001  N003e01  1 

13 

10 

11 

1  4 

170  7127001-A002/002  1 

13 

10 

11 

1  3 

171  7127001-fl003/003  1 

13 

10 

11 

1  3 

172  7127002-AOOl/OOl  1 

12 

1  1 

173  7127002-fl002/002  1 

12 

1  1 

174  7127003-AOOl/OOl  N002101  1 

13 

1  4 

176  7127004-A001/001  1 

9 

1  1 

177  7127004-A002/002  1 

9 

1  1 

178  712700S-fl001/001  1 

14 

1  1 

180  7127006-AOOl/OOl  1 

1  1 

181  7127007-A001/001  1 

3 

19 

19 

19 

19 

15 

1  3 

182  7127009-A001/002  1 

16 

1  2 

183  7127009-A003/004  1 

16 

1  2 

184  7127009-fl005/005  1 

16 

1  2 

185  7127009-A006/006  1 

16 

1  2 

186  7127009-A007/007  1 

16 

1  2 

187  7127009-A008/008  1 

16 

1  2 

188  7127009-A009/009  1 

16 

1  2 

193  7201023-flOOl/OOl  1 

1  1 

194  7201024-A001/001  1 

1  1 

195  720102S-A001/001  1 

1  1 

196  7202003-flOOl/OOl  1 

7 

8 

1  2 

197  7202006-AOOl/OOl  1 

21 

21 

18 

21 

21 

17 

1  3 

198  7203C02-fl001/001  1 

11 

1  1 

200  7203006-flOOl/OOl  1 

14 

• 

• 

1  1 

201  72030O9-AOO1/0O1  1 

9 

• 

6 

• 

7 

1  3 



Table  B-6 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  13 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5  11  12  13 |  1   2   3 |  1   2   3  14 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                |                | USED

203  7203010-AOOl/OOl  1 

12 

204  720301 1-MOl/OOi  1 

16  1 

210  72O3O4fi-fl0Ol/OOl  1 

13 

212  7203048-A001/001  1 

13 

213  7203049-flOOl/OOl            1 7  1   *         *  1   *     '         *  1   *     *     *    *  1  1 

216  7203052-flOOl/OOl            1 6    *     *  1   *     *     *  1   *     *     *     *  1   '     *     *    *  1  1 

217  7203053-0001/001            1   *     *  5 

21S  7203054-AOOl/OO!  1 

10 

219  7227001-0001/001            ,   .     .     .    g    •     •     ■     •  1   *     *     *  1   *     '     *     *  1   *     '     *     '  1  1 

23fi  7301022-0001/002  1 

16 

17 

1  2 

237  7301Q22-A003/004  1 

16 

17 

•  1 

1  2 

• 

• 

• 

16 

17 

•  1 

• 

1  2 

219  7301022-A009/010  1 

• 

• 

16 

• 

17 

•  1 

• 

1  2 

• 

.  1 

1  1 

2ft  1  730301 7«A(y)l/Q01  1 

bBA    f  «IVmVI  (^nv  1/1/ WI  1 

• 

• 

• 

11  1 

12 

• 

• 

1  3 

2ft2  730301 7-A002/002  1 

• 

• 

• 

• 

• 

11  1 

12  • 

• 

• 

1  3 

2flA  7303019-0001/001  N001201  1 
fcSv  f«iiv«jvij  nvvi/wi  rfwibVi  1 

• 

2 

• 

2 

26 

1  4 

9 

b 

•  1 

2 

• 

1  3 

9Q1  71A1A9A-0AA1 1 

IS 

1*1 

15 

,  1 

• 

1  2 

IS 

15 

•  1 

•  • 

•  • 

1  2 

IS 

Id 

IS 

,  1 

1  2 

/tjUJUcD''l4UU3/UU3  > 

ts 

1  1 

7fA  77A7A<(f  .iVWI /AAI  UnA99At  1 
JIU  /•}U«3v31~NUUl/UUl  rlWCCVl  1 

7 

A  1 

D  1 

7 

1  14 

1  4 

7 

1 

7 

*  15 

1  4 

719  77A7/V^1«aAA7/AA7  IMV)99A7  1 

7 

D  1 

7  • 

1& 

1  4 

71 A  77A7M4«^kV>1 /0A1  1 

9  1 

10  • 

1  2 

9  1 

10  • 

1       *  * 

1  2 

11 
1 1 

1  1 

717  77A7M^«AAA7/AA7  1 

11 
1 1 

1  , 

,  1 

1       *  * 

1  1 

71 A  77A7MI^Y^1 /AA9  1 

ifl 

lO 

•  • 

1  1 

71 Q  77A70SA«aAA7/M4  1 

1A 

lO 

1       *  * 

1  1 

1ft 

• 

• 

1  1 

321  73030K-fl007/006  1 

18 

1  • 

1       *  * 

1  1 

322  7303057-0001/002  1 

17 

1  1 

323  7303057-0003/004  1 

17 

1  1 

324  7303057-0005/006  1 

17 

1  1 

325  7303057-0007/OOS  1 

17 

1  1 

342  7401016-0001/001  1 

10 

1  a 

9 

1  3 

345  7401024-0001/001  1 

7 

1  4 

4 

1  3 

346  7401030-0001/001  1 

1  1 

347  7401032-0001/001  1 

1  1 

348  7401066-0001/001  1 

1  1 

349  7401067-0001/001  N003001  1 

1  12 

13 

15  • 

1  3 

350  7401067-0002/002  N003002  1 

1  12 

13 

16  * 

1  3 

351  7401067-0003/003  N003003  1 

1  12 

13 

17  • 

1  3 

353  7401069-0001/001  1 

IS 

1  1 

357  7401073-0001/001  1 

10 

1  1 

358  7401074-0001/001  N001401  1 

11 

12 

21  * 

1  4 

359  7401075-0001/001  1 

4 

1  1 

362  7401078-0001/001  1 

1 

1  1 



Table  B-6 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  13 


                          \  YEAR |  2   2   2   2   2   2   2   2 |
                         \PACKAGE |  1   2   3   4   5  11  12  13 |
POS     ECS  ID         NAEP  ID\ |                                |  USED

3d4  /WlOWHwl/Wl  i 

•    •     •     •   10    •    •  1   •     •     •  1   •     '    '     '  1 

• 

•  1  1 

 1    '     '     •  1    •     •     '  -1 

•  1  1 

371  7¥lfflclHWl/Wl  » 

.     .     1     •     •     •     •  1   6    •     •  1   7     •     •     •  1 

•  1  3 

37j  7¥/3w/*i<Wi7Wl  WWWjUI  i 

•    14    •     •     •  1    •   15    •  1   •   16    •  '1 

18 

•  1  4 

^  7¥M0l7iW0l^  Wi  rwuwui  1 

  1  14    •     •  1  15    '    '  '1 

17 

-  1  3 

.     .     q     ....  1    ...  1    ....  1 

• 

•  1  1 

3o5  7w3W^W01/UOl  !  

.1ft  1    •     •    12  1    '     '     •     •  1 

• 

'  1  2 

366  7yww5HWl/Wi              »  / 

 1   •     'SI'  •31 

■ 

*  1  3 

404  Hg2eOU(Hwl/001  WWloOi  i  

.    .       .    .    .    .  1  .    .    •  1  .    •    •   a  1 

8 

-  1  2 

405  H2ccOOO-w)02/002  NOOioue  1 

 1   •     •     •  1   •     '    •    8  1 

9 

'  1  2 

40b  He22000"W03/003  i 

 1    •     •     •  1   •     •     •    8  1 

•  1  1 

407  HcgOO(H<004/004  \ 

 1    •     •     •  1   •     •    •    8  1 

'  1  1 

40fl  HS2g0OO*'fl00B/0Oa  N00lb04  1 

 1    •     •     •  1   •     •     •    8  1 

11 

'  1  2 

A«Y    IV1A<  \AA    IIAA*  lAAl     klAAA  lAt  1 

417  lg41000HK)01/00^  N004401  1 

 1    •     •     •  1   •     •    •    3  1 

15  1  2 

A  «  A   t  MA  i  ^  AA  AAA4  lAAA                             1  * 

418  Hg4K^yH<003/OOc  1 

 1    •     •     •  1   •     '     '    3  1 

•     1  1 

419  H2410OOHH)O3/OO3   1 

 1    •     •     •  1   •     •    •    3  1 

•    1  1 

433  Hgba(X)(H«01/OOI  WOOcWl  1  

 1    •     •     •  1   •     '     '     •  1 

11 

*     1  1 

434  HcoaOOO-WOc/OOe  WUVcWd  i 

 1    '     '    '  1   •     '    '  '1 

12 

•     1  1 

44%  Hc84000-fl00 1/001  NOUJeOi  i 

 1   •     '     •  1   '     '    '  '1 

7 

•     1  1 

447  Hco400(rR004/004  WW»xU4  \ 

 1   '     '     •  1   '     '    '  '1 

10 

'     1  1 

446  HcobOOOnWO  1/001  WW/Ol  »  

 1   *     '     '  1   '     '    '  '1 

15 

'     1  1 

449  HcBoOOO-WOOf/OOc  NUW/Oc  1 

 1   •     '     '  1   •     '    '  '1 

16 

'     1  1 

450  >^8D00Q*wQ03/00i  WOW/Uj  »  

 •  1   •     '     '  1   '     '    '  '1 

17 

•    1  I 

463  H403000*ftOO  1/001  NOO/iWl  \  

 1   '     •     M   '     '     '    6  1 

1 

'  1  2 

1JAA9AAA..AAAO/AA0  klAA77A9      1  * 

464  H403000*»lOOc/OOg  NOO/JOc  i  

 1    '     •     M   '     '    '    6  1 

2 

•  1  ? 

LEW  ilAA^AAA -AAA9 1AA*>   yAA77A7       1  * 

465  H403000-W03/003  NOU/«>Mi  1 

  1        '     '  1   '     '    '    6  1 

3 

•  1  2 

A^^    IIAA4AAA  AAA  A /AAA    UAA77i*lA       1  * 

466  H403000*ft004/004  N00/JU4  \  

 1    '     •     •  1   '     •    '    6  1 

4 

*  1  2 

A^4   tlAA«AAA   AAAClAAS   kiAAy>Ag       1  * 

467  H403000-W05/005  HOOriOa  » 

 1    •     '     M   '     '    '    6  1 

5 

•  1  2 

A#A    tiAA^AAA  AAA^  /AA£    &1AA71A£       1  * 

468  H403000*w)06/006  N007306  i 

 1    •         '  1   '     '     '    6  1 

6 

'  1  2 

1         Ilk  (  9AAA— AAA9  /AA4  UAABOAd  1 

495  H4l3000-W0c/O02  NOOBcOc  \ 

 1    '     '     '  1   '     '    '     '  1 

2 

*  1  1 

AA^    liAi^AAA  AAA^  1AA9   UAAA4A?       1  * 

496  H4l3000-fl003/003  N006203  1 

 1    '     '     •  1   '     '     '     '  1 

. 

3 

'  1  1 

AM^    IIA  »  4AAA    AAAA  lAAA    AlAAAAAA         1  * 

497  H413000*fl004/004  N006g04  i 

 1    •     '     •  1   '     '     '     '  1 

. 

4 

•  1  I 

634                      M003202  1 

>               ■     ■■*     I     .  •* 
,     1     .     1     •     •     ■  1               1  1 

8 

'  1  JL- 

635  N003203  i 

 1   •     •     •  1   '     '     *     '  1 

9 

'  1  1 

636                      NOOSlOl  1 

 1   •     •     '  1   •     '     '     '  1 

2 

'  1  1 

»mm                                                  klAAKAAi        1  * 

637                      N005001  1 

 1   •     '     •  1   '     '    '     '  1 

6 

'  1  1 

m                                                         AlAA t  YA4       1  * 

638                     N00170S  1 

 1    '     '     '  1   '     '     '     '  1 

4 

'  1  1 

kiAAt  7A9  1 

639  N001703  1 

 1   •     '     •  1   '     '     '     '  1 

5 

'  1  1 

^  A  A                                                kiAA>V\A9       1  * 

640                      M00c003  1 

 1   •     •     '  1   '     '     '     '  1 

13 

*  1  1 

MAt                                          klAADA*       1  * 

641                       N003301  1 

 1   '     •     '  1   '     '    '     '  1 

19 

*  1  1 

642                       NOOlZOg  !  

 1   •     •     •  1   *     •    •     '  1 

27 

'  1  1 

64j                          NUW601  1  

 1   •     •     '  1   '     '    '     '  1 

20 

'  1  1 

644                     N0039O1   1  ' 

 1   •     •     '  1   '     •    '     '  1 

14 

•  1  1 

645                      N006201   1  ' 

 1    '     •     •  1   '     '    '     '  1 

1 

'  1  1 

646                      N008205   1  * 

 1    •     •     '  1   •     '    '     '  1 

S 

'  1  1 

647                     N001603  1  * 

 1   •     •     '  1   '     '    '     '  1 

10 

'  1  1 

648                      N003802   1  * 

 1   •     *     '  1   '     '    '     '  1 

• 

3  1  1 

649                      M003A03   1  * 

 1    '     '     '  1   '     '    '     '  1 

• 

4  1  1 

650                      N004201    1  * 

 1    •     '     '  1   '     '     '     '  1 

18  1  1 

651                       N004202   1  * 

 1   •     •     •  1   '     '    '     '  1 

19  1  1 

 6   6   6 
 1   2   3 


11  11  11  11 | 15  15  15  15 |  NO. 
 1   2   3  14 |  1   2   3   4 | YEARS 




Table  B-6 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  13 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  1   2   3   4   5  11  12  13 |  1   2   3 |  1   2   3  14 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                |                | USED

•     •    •    SI  1 

NUMBER  OF  CALIBRATED            |                                |            |                |                |
ITEMS  IN  BOOKLET                | 19  15  15  15   9  20  34  15 | 30  19  30 | 26  15  22  14 | 15  15  19  14 |

NUMBER  OF  CALIBRATED  ITEMS     |                                |            |                |                |
LINKING  BOOKLET  ACROSS  YEARS   |  6  10   9   6   2   8  10   5 | 29  19  30 | 26  15  22  10 |  8   6   9   7 |

NUMBER  OF  CALIBRATED  ITEMS     |                                |            |                |                |
LINKING  BOOKLET  WITHIN  YEARS   |  0   0   0   0   0   0   0   0 |  4   4   4 |  4   4   4   0 |  0   0   0   0 |




Table  B-7 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  17 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  2   3   4   5   7   8   9  10 |  1   2   3 |  1   2   3  11 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                |                | USED

5  7WW-W01/W 

1 

• 

• 

• 

•  1 

• 

•  • 

•     •  1  } 

1 

• 

i 

•  1 

2 

•  • 

•     •  1  ? 

1 

9 

• 

6 

•  1 

9 

•  • 

•     •  1  ? 

11  709)006-A002/O0e 

1 

5 

• 

6 

•  1 

8 

•  • 

•     •  1  } 

12  7M9M7-A0O1/OO1 

1 

u 

«  1 

6  • 

•     •  I  ? 

13  7099007-MOe/OOe 

NOOS002  1 

13 

6  1 

ft  • 

7 

*  1  4 

I*  7W?W7-wyw 

N00S0O3  1 

I? 

6  1 

6  • 

9 

•     •  1  4 

16  7O990O^«0Ol/OOl 

N003601  1 

•  7 

• 

? 

2 

13 

•     •  1  4 

N003602  1 

7 

• 

2 

14 

•     •  1  4 

19  7099012-ftOOl/OOl 

N003S01  1 

• 

11 

14 

10 

•  1  4 

20  7099013-AOOl/OOl 

1 

■  1  1 

21  70990l3Htt02/002 

1 

9 

■     ■  1  1 

1 

? 

■  1  1 

48  7099020-flOOl/OOl 

1 

■  1  1 

49  7O99020WWJ2/0O2 

1 

i 

■  1  1 

SO  7O990eO-MO3/OO3 

1 

■     ■  1  i 

1 

u 

■  1  1 

S3  7a99021-fl003/0O3 

1 

■  1  1 

Si  7O99021-A0O4/0O4 

1 

M 

■  1  1 

1 

9  1 

■  1  1 

SS  7O99023-A0O2/00e 

1 

9  1 

■     ■  1  » 

89  7M9a23-fl003/003 

1 

9  1 

•     •  1  I 

62  7(.99(J26-«01/00l 

1 

4 

•  13 

« 70?9^»-«oo8/o?? 

1 

6 

4 

•     •  >  ? 

64  7099(tt6-<»03/003 

1 

4 

•  12 

«  7W087-«Wt/9P| 

1 

•     •  1  ? 

67  7101009W)001/001 

1 

•  1  2 

69  7101019-A001/002 

1 

•  »9 

■  1  1 

70  7101019-A003/004 

1 

•  18 

■  1  1 

71  7101019-A005/006 

•  18 

72  7101019-A007/008 

•  18 

73  7101019-A009/010 

-  18 

74  7101019-A011/012 

•  15 

75  7101019-A013/014 

18 

94  7102013-A001/001 

19 

20 

18  1 

20 

20 

13    •  1 

95  7102014-aOOl/OOl            1 6    •     •  1   •     •     •  |   "     •     •     •  |   "     •     •     •  |  » 

96  71<K015-fl001/001            ,   ...     j  j 

107  7102035-flOOl/OOl 

106  7102036-flOOl/OOl            ,   ...    3  j 

109  7102037-flOOl/OOl 

t? 

110  710e037-i»002/002 

113  7103004-flOOl/OOl 

^ 

114  7103012-«01/001 

11 

lis  7103017-flOOl/OOl            1 6  1   •     •     •  1   •     •     •     •  1   •     •     •     •  1  1 

7JWJH09I/WI 

3 

12S  7103031-flOOl/OOl 

• 

126  7103032-flOOl/OOl 

1 

1 

•13 

133  7103039-0001/001 

9 

9 

•  -12 

134  7103041-0001/001 

11  1 

7  1 

7    •  1 

•  13 



Table  B-7 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  17 


                          \  YEAR |  2   2   2   2   2   2   2   2 |  6   6   6 | 11  11  11  11 | 15  15  15  15 |  NO.
                         \PACKAGE |  2   3   4   5   7   8   9  10 |  1   2   3 |  1   2   3  11 |  1   2   3   4 | YEARS
POS     ECS  ID         NAEP  ID\ |                                |            |                |                | USED

135  7lO31Hl-fl00e/002  L 

U.J 

7  1 
/  1 

/ 

— ^ 

13i  7lOJO»l-«0a3/0O3  L 

UJ 

7  1 

/ 

\X7  7iaiHl-«MH/flO»  L 

tu 

7  1 

/ 

■  1  ■ 

12 

la  nwamm  < 

\Z 

— 1 

1*0  71030»3-fl00i/00i  1 

W 

1*1  7lO3OM^i/00l  N0017O1  1 

1 

m  1 

li 

Itf  71O3O»H>00e/fl0g  1 

A 

m  1 

u 

142  7103044-4003/003  1 

A 

i 

• 

1  ■ 

^44  7103048-0001/001  NOflSaOl  1 

OA 

1     •       •       •  97 

144  71030»5->>003/003  IIQ05203  1 

(0 

1      *       *       *  9^ 

147  71030»6-<001/001  I 

14i  71030^4-ftMe/OOe  1 

4 

I 

jSLMSdStmm  1 

» 

ig  7i(ao4»<Doe/ooe  i 

\i 

jsLim^mm  i 

19&  71(a)82-AMl/001  1 

ifi2  71030ai-A001/001  1 

11 

la  7103089-flOOi/ooi  1  .    .    .    .   ^  •    •   •  j  •    •    •  i  -   -    -    -  ^  -    '   '    '  ^  ^ 

(ftA  7iaaofi<WAMi/AAi          1  1  1  •    •    •  1  •    •       •  1  •  : — : —  '  > 

itf  71O30I2-MO1/OO1  1 

\P 

7103062-0002/008  ! 

\P 

l&l  71O3062-O003/003  1 

\P 

IM  7127001-AOOl/OOl  NOOSaOl  1 

i  9 

170  7127001-MOe/OOe  1 

1  9 

1? 

171  7127001-A003/003  1 

1  ? 

M 

174  7127003-0001/001  11002101  1 

}0  1 

10  • 

17t  7127004-0001/001  j 

'  » 

177  7127004-O002/00e  1 

-  12 

tiA  7l»mOMAai/A01             13 ,   ...  1   ....  1        ■     •     •  1  \ 

101  7127007-0001/001  

1 20 

W 

19 

1?  • 

102  7127009-OOOl/OOe 

<5 

103  712700M003/004 

104  712700M005/005 

IPS  7127009-OOOt/OOi  

19 

too  712700^0007/007  

t? 

19 

107  7127009-0001/001  

»9 

t9 

1? 

^9 

191  7801013-0002/003 

10 

193  7201023-AOOl/OOl            1 SI   "     "       1  — ! — L_LJ — — ! — LL_L. 

79AiflM^l/Aai             1 7    .  1   ........  1    •     •  J — •  1  I 

198  7201028-0001/001  

•  W 

|§0  7208003-0001/001  

I  i 

1  ? 

1^  7202000-0001/001 

i  I* 

17 

}9 

17  • 

^  72O300&-O0O1/0O1 

201  7203009W>001/001 

1  1? 

1  »9 

202  72OS009000O1/0O1 

13 

9A3  TMaOIA^i/AOl               17 ,    ......      .   ;  ;  

a04  7203011-0001/001  

20S  7203012-0001/001 

U 

1?  • 



Table  B-7 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  17


\  fSMR  1 

m      US  If           m&  IDA  1 

C 
< 

C 
4 

C  C 

c 

7 

c 

A 

c 

e 

OA 

IV 

1  i 
1  I 

i 

2 

§ 

1  1 1 
1  11 

1  1 

11 

2 

It    1 1  1 
11    11  1 

7     tt  1 

4    11  1 

19  19 
1  2 

tc 
19 

7 

4 

IS  1 

idK7£930l>«Ml/001  1 

\i 

tl 

'  n 

2U  TSOlMI-MOl/aM  1 

to 

m  TaUDBMitt/W  !  *       '  '    >    >    >   1 1  .    ■    >  ^  >    .    >    .  1  >    >    .   .  1  1 

117  TaSJOSHMOl/OOl                                7  , 

-11- 

1  tK 

-1— £  - 

19 

*3 

*   1  9 

M  WlWTWWmt^   1 

19 

toe 

-12- 

*    1  9 

Ji- 

-Jl. 

*    1  9 

1*1 

IS 

'  1 

11  1 

*   1  7 
1  4 

mm  flWfflff  ffWifm  ' 

Ji. 

TUfflWl  flnfttfflfli  1 

*  1  1 

tSI  nnTnia  gnnnffinf  1 

14 

 ^  I. 

14 

*  1  t 

14 

— !  L- 

ai  7atkl5l7^Mll/Qlll  1 
Hi  fwWVtf^^tfWi  ' 

14 

tA 

11 

'  1  1 

14 

lA 

11 
.JjL 

!  ?^ 

A 

a 

9  1 

9      *  1 

9A 
CA 

*   1  A 

• 

A 

V 

9  1 

9     '  1 

 ! — 

• 

ts 

-12- 

•A 

*   1  9 

 '  .i 

• 

IIL 

*  1  1 

1\6  73tllflSl«ttk0t/A0l  IIOA9Mt  1 

A 

J.- 

lA 

1^ 

*   1  A 

lit  yiA^I  IWfffflft^  llM09(i9  1 

A 

f  ■ 

lA 

—13- 

*   1  A 

tkP  73ll3Ml«Attl/ilAl  IIMdMS  1 

-JL. 

A 

1A 
JS-. 

lA 

*   1  A 

11 A  TMlML^MHiMM  1 

*  11 

Jl- 

JL. 

*    1  7 

1  4 

US  T'^V**       /fwy  1 

11 

Ji- 

Ji. 

*    1  7 

f 

*    1  1 

.JL 

*     1  4 

-LU 

-All  7Jwai*wwww   t 

. '  L. 

ISA  THIMit-MMiMC  1 

-if  1 

•  1  1 

191  71fiMt^M(7/AAA  i 

Jci  f JMWigwf/wi   1 

JfJ 

*  1  f 
•  * 

322  73030S7-4001/002  1 

17 

*  1  1 

 ! — 1 — 

323  730aOS7-MIO(lA)OI  1 

|7 

•  1  \ 

|7 

•  .•  1 

3B  7aO]IIS?^/O0i  1 

17 

•  1  1 

j»  mw-mm  i 

\ 

•  1  1 

343  7M10M^1/001  1 

ll 

'  1  .1 

34&  7M1«3^/001  1 

3*7  74010&^1/M1  1 

•  1  1 

3M  7401017^1/001  NOOaOOl  1 

7 

19 

•  1  4 

1 

7 

? 

'  1  4 

3S1  74O10&7-MO3A»3  ilOO.1003  1 

7 

|7 

'  1  4 

3S4  74010KHi001/001  1 

■  •  \ 

3SI  740107*^1/001  N001401  1 

1  ' 

3 

'  1  4 

M6  74O1O7HI001/001  1 

f 

'  1  1 

Table  B-7 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  17 


\  (EAR 
\PfOm 


ZZZZZZZZ 
2345789  10 


383  7401079-AOOl/OOl  1 

•     -14 1   •     ■     •  1 

•     •    •    •  1 

•  1 

1 

384  740i08Hl001/001  1 

 3    •  1   •  -31 

•     ■    3    •  1 

•  1 

3 

387  7401083-AOOl/OOl  1 

•     ■     2 1    •   12    ■  1 

■   12    •  '1 

•  1 

3 

37S  74030e;-M01/001  N0O4901  1 

•     ■     •   14    •     •     •     •  1   •     5    •  1 

•    5    •    •  1 

18 

•  1 

4 

377  7403019-0001/001  N002S01  1 

  4    •     ■  1   •     •    12  1 

•     •   13    •  1 

17 

•  1 

4 

361  790300SHI001/OO1  1 

 4  1"     ■  "1 

•     ■     •     •  1 

•  1 

I 

3B  7SO3044-A0O1/OO1  1 

•     •    3 1   •    6    ■  1 

•     ■    •     •  1 

•  1 

2 

388  7S0304S-fl001/001  1 

 11    ■  "1 

1    ■     •    •  1 

•  1 

2 

390  H202000-M01/001  1 

 1   '     ■  "1 

•  ••41 

•  1 

I 

391  H20e000-M02/O02  1 

 1   ■     ■  "1 

■     ■    •    4  1 

■  1 

392  »«a02000-A0O3/OO3  1 

 1   ■     ■  "1 

•     ■     •    4  1 

•  1 

1.. 

393  H2O200O-A004/OO4  1 

 1   ■     ■  "1 

•     •     ■    4  1 

•  1 

I 

433  H285000-A001/001  N002001  1 

 1   ■     ■  "1 

■     ■    •     •  1 

11 

•  1 

1 

434  H28S000-A00e/O0e  N00200;  1 

 1   ■     ■  "1 

•     ■    •     •  1 

12 

•  1 

I 

444  H2B4O0O-A001/OO1  N003201  1 

 1   •     •  "1 

•     •    •    •  1 

7 

•  1 

I 

447  H28400O4004/OO4  N003204  1 

 1   •     •  "1 

•     ■     •    •  1 

10 

•  1 

1 

4S3  mO3000-A0Ol/OOl  N007301  1 

 1   •     •  "1 

•••61 

I 

■  1 

2 

484  H4O300OW»0e/O0e  N007302  1 

■  - 1   •     •     •  1 

•••61 

2 

•  1 

2 

465  H403000ti003/003  N007303  1 

O  1 

•  1 

2 

468  H4O3000-fl004/O04  N007304  1 

 1   •     ■     •  1 

•■•61 

4 

•  1 

2 

467  H4O300O-fl00S/OO5  N007308  1 

 1   •     •  "1 

■■•61 

5 

•  1 

2 

488  H403000-A008/006  N007306  1 

 1   ■     ■  "1 

■■•61 

6 

•  1 

2 

495  H413000-A002/002  N00a202  1 

 1   •     ■  "1 

■     •     •    •  1 

2 

•  1 

I 

498  H413000-A003/003  N00a203  1 

 1   ■     ■  "1 

■     •    •     •  1 

3 

•  1 

I 

497  H41300O-A0O4/O04  N008204  1 

 1   ■     ■  "1 

■    •     •    •  1 

4 

•  1 

I 

518  H441000-fl001/001  1 

 1   ■     ■  "1 

•     ■     •    7  1 
•■•71 

•  1 

•  1 

1 
1 

517  H441000-i)002/002  1 

518  H441000-fl003/003  1 

 1   ■     ■  "1 

 1   ■     ■  1 

•••71 

■  1 

I 

834                     11003202  1 

 1   ■     ■  "1 

•     ■     •     •  1 

8 

•  1 

I 

635                     N003203  1 

•     •     ' 1   ■     ■  "1 

•     •     •     •  1 

9 

•  1 

638                     N005101  1 

 1   ■     ■  "1 

•     •     •     •  1 

2 

•  1 

1 

837                      N005001  1 

 1   ■     ■  "1 

•     •     •     •  1 

6 

•  1 

1 

838                      N001702  1 

 1   ■     ■  "1 

•     ■     •     •  1 

4 

■  1 

L 

839                     N0C1703  i 

 1   ■     ■  "1 

•     •     •     •  1 

5 

•  \ 

640                      N002003  1 

 1   ■     ■  "1 

•     ■     •     •  1 

13 

•  1 

I 

641                     N003301  1 

 1   •     •     •  1 

•     ■     •     •  1 

19 

•  1 

I 

842                     N001202  1 

 1   ■     ■  "1 

•     ■     •     •  1 

27 

•  1 

1 

643                      N00480i  1 

 1   ■     ■  "1 

•     •     •     •  1 

20 

•  1 

I 

644                      N003901  1 

 1   •     ■  "1 

•     •     •     •  1 

•  1 

1 

845                      N008201  1 

 •  1   •     •  '1 

•     •     •     •  1 

1 

•  1 

648                     N008205  1 

•     •     •     •  1 

5 

•  1 

847                      N001803  1 

•     •     •     •  1 

10 

•  1 

848                     N003802  1 

•     ■     •     •  1 

3  1 

649                     N003803  1 

•     •     •     •  1 

4  1 

850                     N004201  1 

•     ■     •     ■  1 

18  1 

851                     N004202  1 

•     •     ■     •  1 

19  1 

852                      N002102  1 

•     ■     ■     •  1 

6  1 

853                      N008108  1 

•     ■     ■     •  1 

14  1 

6  6  6 
1    2  3 


11  11  11 
1    2  3 


11  I 
II  I 
I 


IS 
1 


IS 
2 


IS  IS  I  NO. 
3    4  IVEARS 
I  USED 




Table  B-7 
IRT  TREND  ANALYSIS  ITEM  TABLE  FOR  AGE  17 


\  YEAR  1 
\PACKAGEI 

2 
2 

2 
3 

2 

2 
a 

2 

7 

2 
B 

2 

2  1  fi 

10   1  t 

6 

0 

c 

6  1  11 
1  1  1 

i   1  1 

11 
c 

11 

11  1 

1  1  1 

IS 

1 
i 

15 

9 
C 

15 

tj 

15  1  NO. 
4  lYEARS 

POS     ECS  10          NPEP  I0\  1 

1 

1 

1 

1  USED 

6^  WWWWC__L 

16  1  L 

fiss               mvm  1 

17  1  1 

NUMBER  OF  CALIBRATED  1
ITEMS  IN  BOOKLET  1

18 

21 

10 

JS 

21 

23 

14 

1 

18  1  30 

18 

1 

30  1  22 

17 

22 

1 

13  1 

12 

15 

16 

1 

12  1 

NUMBER  OF  CALIBRATED  ITEMS  1 
LINKING  BOOKLET  ACROSS  YEARS  1

7 

11 

S 

9 

7 

8 

10 

1 

S  1  29 

18 

1 

30  1  22 

17 

22 

1 

6  1 

8 

8 

6 

1 

4  1 

NUMBER  OF  CALIBRATED  ITEMS      I  'a 
LINKING  BOOKLET  WITHIN  YEARS  1000    0000013  331333010000 




Table  B-8 


Item  Parameter  Estimates  and  Standard  Errors 


Item

ETS  ID

a 

1 

NOOllOl 

A    O  /  OOil 

0. 343oo 

2 

N001201 

A    *71  1  QO 
0« /113Z 

3 

N001202 

1. 2/63/ 

4 

N001301 

0. 9oj91 

5 

N001302 

A  ATO 

0. 71972 

6 

N001303 

1*33406 

7 

N001401 

0. 99901 

8 

NOOIdOI 

1 • o0oZ4 

9 

N001502 

1 • 043iD 

10 

N001503 

1. 34461 

11 

N001304 

1    y.  y.  "7  "7  Q 
1. 44/ /o 

12 

^lAAl  CA^ 

N001506 

A    £  C /  O^ 

0.65437 

13 

E1A/\1  ^A1 

NUU1601 

A  ilOO^O 

0. 62223 

14 

ElAA  1  ^  AO 

N001602 

1. 26259 

1  c 

15 

mAAI £A0 

N001603 

A    O  1  CilO 

0«ol562 

16 

N001604 

1     OT/ Hi 

1. 3/491 

17 

UAAI ^A1 

N001701 

A    001  Oil 

0. 98126 

18 

UAAl ^AO 

NUU1702 

A    C / AOA 

0.54099 

19 

maai ^ao 

N001703 

1     A01 A^ 

1.0810/ 

21 

^lAAl  OAO 

N001802 

1    cool  1 
1. 59211 

22 

MAA1 AA1 

N001901 

1 . 0433/ 

23 

MAA1 OAO 

N001903 

A  o  0  y.  o  o 
U. 734ZZ 

24 

MAAO AA1 

N002001 

ZD 

MAAO AAO 

NUU2UU2 

1   y.  y.  0  Q 1 

26 

M/^AO  AAO 

N00200i 

1 . DoZoU 

27 

MAAO 1 A1 

N002101 

A  Qy.AQO 

Zo 

MAA01 no 

1   y.  o  y.  Q  y. 

29 

MAAOO A1 

N002201 

1. /UjoI 

30 

MAAOO AO 

N002202 

1 . 35/86 

O  1 

31 

f  tAAOO  AO 

N002203 

0. /8303 

oo 
32 

KlAAO  y.  A1 

N00Z401 

1   y.  y.  ono 

33 

MAAOQAI 

NUUZDUi 

A  Qy.QrQ 

34 

N002701 

1.02419 

35 

N002702 

1 . 14818 

36 

N002801 

1.92053 

37 

N002802 

1.89576 

38 

N002803 

0.33105 

39 

N002902 

0.55751 

40 

N002903 

2.31343 

41 

N002904 

1.28934 

42 

N002905 

0.75794 

43 

N002906 

1.96425 

44 

N003001 

1.29316 

s.e. (a) 


s.e. (b) 


s.e. (c) 


0.04756 
0.18319 
0.18712 
0.11808 
0.08748 
0.13290 
0.09413 
0.13010 
0.09754 
0.08765 
0.08909 
0.04285 
0.04094 
0.07887 
0.07272 
0.10123 
0.06647 
0.11563 
0.08003 
0.14013 
0.11147 
0.03812 
0.06511 
0.08389 
0.09278 
0.09421 
0.09999 
0.11834 
0.11962 
0.06634 
0.09561 
0.05253 
0.10249 
0.07671 
0.11379 
0.10954 
0.02801 
0.04967 
0.18003 
0.09451 
0.05783 
0.14802 
0.10896 


-0.38421 
1.14426 
0.58486 
0.49490 
-1.54817 
0.40675 
0.00071 
-1.31319 
-0.30728 
-0.90181 
-0.65029 
2.07853 
-0.95892 
-0.69250 
-0.03084 
0.11143 
-0.41778 
2.65058 
0.00329 
0.72655 
0.20973 
0.15576 
-0.01271 
-0.04179 
-0.22910 
1.17114 
0.84044 
-0.12913 
-0.34881 
-1.13922 
-0.37505 
0.12918 
0.83348 
0.05508 
-0.76744 
-0.91192 
2.20046 
-0.80149 
-0.34122 
-0.02029 
0.24793 
-0.36322 
1.15281 


0.10402 

0.40276 

0.19679 

0.17315 

0.24087 

0.13115 

0.11426 

0.15232 

0.06080 

0.08628 

0.06788 

0.14371 

0.08244 

0.06605 

0.06802 

0.06294 

0.09050 

0.62070 

0.09642 

0.13110 

0.09766 

0.02868 

0.04963 

0.05539 

0.05407 

0.17573 

0.11809 

0.07771 

0.11154 

0.13696 

0.05657 

0.10047 

0.16436 

0.06496 

0.08091 

0.09238 

0.18826 

0.11433 

0.08178 

0.08728 

0.08254 

0.08228 

0.16887 


0.29068 

0.36925 

0.25554 

0.40040 

0.49736 

0.28084 

0.25110 

0.22542 

0.18160 

0.20744 

0.17293 

0.00000 

0.13299 

0.25043 

0.23261 

0.26918 

0.23118 

0.23141 

0.29079 

0.21661 

0.33071 

0.00000 

0.13080 

0.20290 

0.22409 

0.24676 

0.14741 

0.20050 

0.33683 

0.23625 

0.12753 

0.20489 

0.23428 

0.14060 

0.17456 

0.14345 

0.00000 

0.22919 

0.25293 

0.19746 

0.11561 

0.23041 

0.20713 


0.05300 

0.05312 

0.03702 

0.04266 

0.09718 

0.03014 

0.05479 

0.04742 

0.02631 

0.04272 

0.03234 

0.00000 

0.04576 

0.03097 

0.03311 

0.01844 

0.05878 

0.02826 

0.04417 

0.01183 

0.02786 

0.00000 

0.02011 

0.02023 

0.02191 

0.01860 

0.01242 

0.03659 

0.05918 

0.08550 

0.02317 

0.05712 

0.03194 

0.02286 

0.02795 

0.02805 

0.00000 

0.07069 

0.03950 

0.04124 

0.04032 

0.04409 

0.01285 




Table  B-8 
(continued) 

Item  Parameter  Estimates  and  Standard  Errors 


Item

ETS  ID

a

HD 

WUUjUUZ 

\J  •  nJ\J  y  XW 

HO 

9  29397 

A7 

NvU^  11/ 1 

1  57066 

1.  •  J  t  www 

AO. 

ni/ujiuz 

1  S3025 

AO 

NUU^lUJ 

0  70389 

CA 
DU 

NtlU^ZUl 

1  9070Q 

<\1 
Dl 

ni/U3zuz 

1  590A7 

DZ 

MHO '^90'^ 

1  21513 

X  •     X ^ x*^ 

<\7 
Dj 

1  A5667 

NUU^ JUl 

1  1A1S0 

cc 
DD 

Mfifi'^Ani 

rlUUj»*Ul 

1  A672A 

X  •  "tW  /  £t*r 

Ci: 
JO 

Nvvj JUl 

0  75144 

3/ 

NHO'^Ani 

nUvr^OUl 

1  45231 

3o 

Nnn'^An9 

ni/w^owz 

1  31985 

0  73641 

\J  •  1  ^  W*T  X 

OU 

wjyjD  1  wz 

1  07106 

^  •  \I  1  X  W  V 

01 

Nnn'^7n'^ 

0  68872 

OZ 

NOO'^ftni 
ni/w^owx 

0  89130 

W  •  W  ^  X 

0.41376 

0.75737 

AS 
0  J 

1.37453 

AA 

N00A009 

0.61451 

w  / 

NOOAlOl 

1.09618 

Aft 
wO 

N00A901 

1.10307 

AQ 

NnnA9n9 

0.76199 

1.41953 

71 

0.62069 

79 

N00AA01 

1.71824 

7'^ 

N00AA09 

0.87572 

74 

N004403 

ft^  V  V              V  *^ 

1.64193 

75 

N004501 

0.97362 

76 

N004502 

0.68013 

77 

N004601 

0.89929 

78 

N004602 

1.31823 

79 

N004603 

1.48506 

80 

N004604 

0.79691 

81 

N004701 

1.69375 

82 

N004702 

0.76376 

83 

N004703 

1.02059 

84 

N004801 

1.25733 

85 

N004901 

0.91600 

86 

N005001 

1.99291 

s.e. (a) 


s.e.(b) 


s.e. (c) 


0.02912 

0.10854 

0.10038 

0.08338 

0.04221 

0.08792 

0.12373 

0.10141 

0.11979 

0.08108 

0.14986 

0.06201 

0.11616 

0.10929 

0.06121 

0.08364 

0.03648 

0.11188 

0.02983 

0.09309 

0.19223 

0.07927 

0.08674 

0.07055 

0.07167 

0.12483 

0.04330 

0.12695 

0.07482 

0.12805 

0.10339 

0.05431 

0.07817 

0.10310 

0.11293 

0.04948 

0.10145 

0.06512 

0.06467 

0.08452 

0.05695 

0.10250 


0.11920 
1.72388 
-0.64538 
-0.35908 
1.92324 
-0.59316 
0.01181 
0.23965 
0.25952 
-0.40955 
-0.20675 
-0.44828 
-0.66817 
-0.13019 
-0.76037 
-0.01031 
0.29437 
1.46486 
-0.70293 
1.60016 
-1.84669 
-1.42643 
-1.12198 
0.03059 
0.18683 
0.40405 
0.58170 
-1.77446 
-0.22006 
-1.46711 
0.49345 
-0.82441 
0.17933 
-0.08468 
-0.51585 
-0.61689 
-0.51490 
-0.92837 
-0.65101 
-1.25766 
0.22127 
1.37994 


0.06518 

0.18958 

0.07260 

0.05122 

0.12434 

0.08731 

0.09272 

0.10682 

0.11171 

0.07830 

0.09152 

0.09301 

0.09937 

0.09684 

0.10437 

0.07761 

0.03458 

0.25097 

0.07824 

0.24505 

0.33113 

0.21398 

0.11390 

0.06151 

0.09769 

0.13117 

0.05603 

0.20185 

0.06568 

0.17026 

0.15138 

0.10455 

0.10423 

0.09194 

0.08941 

0.05145 

0.05860 

0.10528 

0.06202 

0.10848 

0.05969 

0.15923 


0.16796 

0.11968 

0.26709 

0.14507 

0.00000 

0.17078 

0.22727 

0.22248 

0.23829 

0.15767 

0.15919 

0.17212 

0.20310 

0.24110 

0.23915 

0.23644 

0.00000

0.30911 

0.11010 

0.20621 

0.23174 

0.24613 

0.22894 

0.18470 

0.29069 

0.28769 

0.00000 

0.26243 

0.14810 

0.22811 

0.30495 

0.17982 

0.18399 

0.24890 

0.22557 

0.00000 

0.20392 

0.23709 

0.15261 

0.24193 

0.19010 

0.21076 


0.04095 

0.00632 

0.03160

0.02252 

0.00000 

0.05590 

0.03782 

0.03861 

0.03463 

0.04884 

0.04704 

0.06140 

0.05986 

0.04772 

0.05983 

0.03157 

0.00000 

0.01789 

0.04749 

0.01882 

0.08938 

0.09327 

0.05358 

0.02437 

0.03786 

0.03220 

0.00000 

0.06523 

0.03611 

0.05425 

0.04322 

0.06770 

0.04832 

0.04420 

0.05438 

0.00000 

0.02105 

0.05750 

0.03124 

0.04726 

0.02127 

0.01059 




Table  B-8 
(continued) 

Item  Parameter  Estimates  and  Standard  Errors 
Item     ETS  ID         a      s.e.(a)         b      s.e.(b)  c  s.e.(c) 


87  N005002  0.85936  0.10848    1.28831  0.24040  0.26406  0.02938 

88  N005003  0.73710  0.10541    1.90503  0.33064  0.13504  0.02371 

89  N005101  0.84190  0.06067  -2.13982  0.17848  0.23555  0.08295 

90  N005201  0.67359  0.10746    0.63625  0.22965  0.48066  0.05423

91  N005202  0.59985  0.07072    0.58164  0.15242  0.20628  0.05831 

92  N005203  1.14306  0.12119    1.83713  0.28411  0.30875  0.01452 

93  N005301  1.13281  0.14635  -0.02794  0.13187  0.28349  0.05862 

94  N005302  1.40569  0.14546    0.38677  0.11915  0.12855  0.02959

95  N005303  0.86748  0.19494    1.00844  0.34366  0.32993  0.04769 

96  N005304  1.81002  0.19697    0.05192  0.11407  0.22677  0.03838 

97  N005305  1.08554  0.12148  -0.67673  0.13021  0.22225  0.07653 

98  N005403  1.34660  0.15280  -0.33453  0.11463  0.28866  0.06005 

99  N005404  1.45537  0.13848  -1.03748  0.14409  0.18712  0.06678 

100  N005405  2.01849  0.19497    0.06798  0.09988  0.20587  0.03105 

101  N005406  1.20953  0.11578  -0.39768  0.09352  0.18463  0.05446 

102  N005407  1.77747  0.20114  -0.24601  0.11000  0.32637  0.04931 

103  N005503  0.71843  0.07420   0.35569  0.12684  0.21105  0.05387 

104  N005504  1.31644  0.11181    0.77755  0.14729  0.21947  0.02374 

105  N005505  1.12595  0.09159  -0.91282  0.12097  0.24680  0.07913 

106  N005601  1.38681  0.15119  -0.65281  0.12495  0.25274  0.07133 

107  N005602  1.71547  0.18749    0.29673  0.13264  0.20783  0.03130 

108  N005603  1.48703  0.17122  -0.17746  0.11343  0.30619  0.05075 

109  N007301  1.18343  0.09087  -0.39389  0.09968  0.27841  0.05857 

110  N007302  0.81787  0.05868    0.28535  0.08412  0.1^546  0.03885 

111  N007303  1.10993  0.07680  -0.02429  0.08404  0.19644  0.04296 

112  N007304  0.88667  0.07155  -0.00694  0.09984  0.22304  0.05305 

113  N007305  0.52937  0.04195    0.01012  0.07697  0.13321  0.04958 

114  N007306  1.00946  0.05679  -0.11609  0.05916  0.10318  0.03478 

115  N007401  1.09780  0.07624    0.53070  0.09581  0.12305  0.02729 

116  N007402  1.30369  0.08423  -0.31735  0.07522  0.17614  0.04544 

117  N007403  1.75588  0.11861    0.21398  0.09309  0.23342  0.02691 

118  N007404  0.98461  0.07227    0.05998  0.08757  0.18069  0.04382 

119  N007405  0.88671  0.10164    1.40133  0.22916  0.18663  0.02508 

120  N007407  0.85127  0.04249    0.67125  0.04940  0.00000  0.00000 

135  N008201  2.72430  0.30170  -0.47105  0.13058  0.32310  0.05201 

136  N008202  1.14590  0.10210  -0.06540  0.10238  0.18767  0.05219 

137  N008203  1.54336  0.14080  -0.28855  0.10359  0.24721  0.05374 

138  N008204  2.59971  0.23614  -0.22772  0.09180  0.20928  0.03761 

139  N008205  2.14522  0.18750  -0.25591  0.09158  0.20457  0.04198 

140  N008207  0.59778  0.06018    2.25854  0.23732  0.00000  0.00000 

141  N008601  1.78949  0.17931  -0.97240  0.17067  0.16947  0.03727 

142  N008602  1.36797  0.17927  -0.55448  0.12210  0.26122  0.04075 
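
The a, b, and c columns of Table B-8 can be read as item response curves. The sketch below assumes they are the discrimination, difficulty, and lower-asymptote parameters of a three-parameter logistic model, and it assumes the conventional scaling constant D = 1.7; neither assumption is stated on this page, and the function name `p_correct` is illustrative only. It evaluates the curve for item 87 (N005002) at a few proficiency values.

```python
import math

def p_correct(theta, a, b, c, D=1.7):
    """Three-parameter logistic probability of a correct response.

    theta: examinee proficiency; a: discrimination; b: difficulty;
    c: lower asymptote. D = 1.7 is an assumed scaling constant.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# Item 87 (N005002) as listed in Table B-8.
a, b, c = 0.85936, 1.28831, 0.26406

for theta in (-2.0, 0.0, b, 2.0):
    print(f"theta={theta:+.3f}  P(correct)={p_correct(theta, a, b, c):.3f}")
```

At theta = b the curve sits halfway between c and 1, which is a quick way to check that a row of the table has been read correctly.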




Table B-8
(continued)

Item Parameter Estimates and Standard Errors

Item  ETS ID         a  s.e.(a)         b  s.e.(b)        c  s.e.(c)

 143  N008603  1.20570  0.11778  -0.98525  0.13713  0.14020  0.04300
 144  N008701  1.19247  0.13395  -2.39050  0.34204  0.24019  0.08756
 145  N008801  1.48897  0.10032  -1.78932  0.17324  0.19373  0.05559
 146  N008901  1.32842  0.10576  -1.24354  0.13802  0.14826  0.04118
 147  N008902  1.25848  0.10203  -1.27067  0.13970  0.15626  0.04339
 148  N008904  0.67255  0.06421  -2.50923  0.25598  0.00000  0.00000
 149  N009001  1.32810  0.15196  -0.43338  0.09699  0.15397  0.03091
 150  N009002  1.17681  0.16319  -0.09320  0.08727  0.17778  0.02989
 151  N009003  0.84446  0.20302   0.75488  0.24155  0.22631  0.03200
 152  N009004  1.76785  0.22483  -0.34990  0.10912  0.23959  0.02739
 153  N009101  1.00715  0.12010  -1.45087  0.21000  0.25570  0.07574
 154  N009201  1.79466  0.17188  -1.37708  0.21643  0.30102  0.05431
 155  N009401  1.88222  0.12694  -1.40233  0.17247  0.10485  0.03564
 156  N009601  1.36014  0.10602  -1.87239  0.20702  0.12966  0.05296
 157  N009701  1.08207  0.12379  -0.65381  0.11174  0.16395  0.04143
 158  N009702  1.95945  0.22710  -0.53268  0.13089  0.24930  0.02757
 159  N009703  1.44885  0.21085  -0.16529  0.09722  0.25808  0.02879
 160  N009704  1.14952  0.18527   0.03287  0.09564  0.20890  0.03109
 161  N009705  1.95691  0.20749  -0.70162  0.14664  0.21082  0.02946
 162  N009801  1.39630  0.13433  -2.22709  0.29571  0.25930  0.08622
 163  N009901  0.97641  0.11673  -1.04928  0.15973  0.20639  0.05932
 164  N010002  1.29007  0.13727  -1.09449  0.16480  0.17234  0.04708
 165  N010003  1.65704  0.19421  -0.94025  0.17915  0.24113  0.04220
 166  N010102  1.12440  0.19308  -0.04987  0.11066  0.26735  0.03663
 167  N010103  1.79514  0.20006  -1.07518  0.20659  0.20880  0.04202
 168  N010201  1.24259  0.12137  -1.93207  0.24523  0.24403  0.07778
 169  N010301  0.70206  0.08494  -2.38310  0.31786  0.24783  0.09312
 170  N010401  0.71532  0.08678  -1.48668  0.20893  0.21889  0.07731
 171  N010402  0.92807  0.17080   0.13218  0.11255  0.22214  0.03717
 172  N010403  1.03079  0.19719   0.46460  0.15326  0.18965  0.02710
 173  N010501  2.02312  0.13916  -1.49019  0.19000  0.20367  0.04622
 174  N010502  1.20386  0.11435  -1.19606  0.15414  0.15642  0.04854
 175  N010503  1.45500  0.12343  -1.45957  0.18367  0.15940  0.04832
 176  N010504  2.30030  0.16567  -1.11438  0.17388  0.18153  0.03226
 177  N010601  1.60441  0.19642  -0.63370  0.13572  0.24626  0.03552
 178  N010602  1.78849  0.34380   0.20854  0.15336  0.30609  0.02271
 179  N010603  1.35893  0.19872  -0.25829  0.10519  0.23369  0.03268
 180  N010604  1.63695  0.25020  -0.10064  0.10586  0.23522  0.02634
 181  N010605  1.21990  0.19002  -0.06859  0.09829  0.18417  0.03126
 182  N010701  1.06419  0.12219  -1.28254  0.18812  0.17860  0.06280
 183  N010801  1.08381  0.11919  -0.47131  0.08698  0.26010  0.03530
 184  N010902  1.56357  0.15281  -0.46713  0.08701  0.24105  0.02630



Table B-8
(continued)

Item Parameter Estimates and Standard Errors

Item  ETS ID         a  s.e.(a)         b  s.e.(b)        c  s.e.(c)

 185  N010903  1.85005  0.15682  -0.56355  0.09617  0.19270  0.02240
 186  N010904  1.52152  0.17022  -0.24453  0.08018  0.27499  0.02425
 187  N011001  1.27933  0.11570  -0.87939  0.11252  0.22839  0.03695
 188  N011002  1.65730  0.16726  -0.31498  0.08113  0.25211  0.02193
 189  N011003  2.41596  0.16943  -0.92768  0.14716  0.24143  0.02440
 190  N011004  1.78824  0.15874  -0.54263  0.09661  0.22609  0.02276
 191  N011101  1.56839  0.14127  -0.54055  0.08967  0.19668  0.02517
 192  N011201  0.91063  0.11742  -0.25916  0.08529  0.25976  0.03693
 193  N011301  1.65295  0.14312  -0.75604  0.11135  0.21079  0.02826
 194  N011302  0.99244  0.11855  -0.42973  0.08894  0.22689  0.03887
 195  N011401  0.83827  0.19405   0.69656  0.22734  0.33376  0.02999
 196  N011402  0.82218  0.13917   0.01027  0.10228  0.28837  0.04148
 197  N011403  0.97146  0.19233   0.62100  0.18629  0.26972  0.02544
 198  N011404  1.32661  0.21546   0.49246  0.15146  0.21998  0.01852
 199  N012901  1.09100  0.09082  -1.60092  0.16886  0.13573  0.05331
 200  N013001  1.01980  0.12202  -0.34301  0.08344  0.16456  0.03631
 201  N013002  0.97225  0.12117  -0.38259  0.08966  0.18704  0.03960
 202  N013003  1.71689  0.10408  -1.12308  0.17239  0.23372  0.04150
 203  N013004  0.99397  0.11477  -0.94626  0.13780  0.21625  0.0557b
 204  N013101  1.75676  0.13641  -1.56003  0.19617  0.21484  0.05240
 205  N013102  1.40116  0.14636  -0.78899  0.J2368  0.21834  0.03905
 206  N013103  0.95398  0.09740  -0.86824  0.11733  0.14741  0.04563
 207  N013104  0.75969  0.11642  -0.42068  0.11125  0.21563  0.05536
 208  N013201  1.66526  0.21104  -0.69299  0.16001  0.18120  0.03670
 209  N013301  1.23192  0.16119  -1.55713  0.26807  0.25257  0.07686
 210  N013401  1.20290  0.17717  -0.25020  0.10663  0.15710  0.03510
 211  N013402  1.43756  0.18853  -0.86245  0.17513  0.20481  0.04755
 212  N013403  1.49438  0.22338  -0.27786  0.11622  0.19858  0.03309
 222  N014001  1.23795  0.15297  -0.85718  0.14869  0.24936  0.04816
 223  N014101  0.75814  0.07084  -1.28284  0.14222  0.16926  0.05891
 224  N014201  1.20695  0.13433  -1.21830  0.18886  0.13647  0.05202
 225  N014301  1.75466  0.19139  -0.81991  0.15771  0.19017  0.03482
 226  N014302  1.07393  0.13609  -0.49827  0.10782  0.18111  0.04062
 227  N014303  1.72083  0.18651  -1.04120  0.18841  0.20824  0.04131
 228  N014501  0.43233  0.06452  -2.26356  0.34802  0.00000  0.00000
 229  N014502  0.93414  0.12331  -2.66383  0.40566  0.00000  0.00000
 230  N014503  0.62434  0.13263  -4.12019  0.90265  0.00000  0.00000
 231  N015101  0.93155  0.11001   0.34271  0.16850  0.23395  0.06686
 232  N015102  2.53320  0.23579   0.54834  0.20683  0.21641  0.03021
 233  N015103  2.40149  0.19982   0.66008  0.19694  0.21907  0.02802
 234  N015104  1.70738  0.18424   0.44096  0.19311  0.27788  0.04502
 235  N015201  1.08863  0.12589  -0.76561  0.14968  0.22684  0.08504




Table B-8
(continued)

Item Parameter Estimates and Standard Errors

Item  ETS ID         a  s.e.(a)         b  s.e.(b)        c  s.e.(c)

 236  N015502  1.27279  0.12588   0.18925  0.14019  0.20864  0.05669
 237  N015503  0.91211  0.11941   0.75605  0.21582  0.24651  0.05565
 238  N015504  1.18882  0.12068   0.10997  0.13819  0.22004  0.06172
 239  N015505  0.68340  0.08254  -0.17492  0.14577  0.24726  0.08705
 240  N015901  1.02091  0.13331   0.37058  0.20437  0.33263  0.06812
 241  N015902  1.38041  0.16520   0.72633  0.23435  0.31666  0.04295
 242  N015903  1.18177  0.12895   1.10148  0.22398  0.15255  0.03180
 243  N015904  0.65706  0.06232   0.85809  0.10582  0.00000  0.00000
 244  N016001  1.04259  0.12162   0.03342  0.16386  0.28516  0.07766
 245  N016002  1.38569  0.15434   1.24735  0.27581  0.45609  0.02768
 246  N016003  0.90591  0.10257   0.35438  0.15749  0.20510  0.06497
 247  N016004  1.09517  0.12568   0.10257  0.16454  0.27136  0.07392
 248  N016005  1.73407  0.17470   0.15598  0.15207  0.23040  0.05394
 249  N016006  1.35727  0.13670   0.42447  0.16122  0.20306  0.04911
 250  N017001  1.51834  0.15713   0.48407  0.17457  0.21320  0.04241
 251  N017002  1.93512  0.13762   1.10006  0.19253  0.19574  0.02171
 252  N017003  1.83349  0.12901   1.76996  0.24850  0.17677  0.01566




GLOSSARY 


IMPLEMENTING THE NEW DESIGN:
THE NAEP 1983-84 TECHNICAL REPORT


GLOSSARY 


administration. The conduct of a
National Assessment session.


Administration Schedule. A list of the
name, age and sex of each student
invited to a particular assessment
session.


administration time. The total time
allowed for an item. (Includes the
time allowed for the stimulus and the
response.)


administration timetable. Time periods
during the school year when the
various grade/age groups are assessed.
The time periods for the Year 15
assessment were:

Grade 8/Age 13
October 10 to December 16, 1983

Grade 4/Age 9
January 2 to March 9, 1984

Grade 11/Age 17
March 12 to May 11, 1984

administrative  units.     Geographic  areas 
such  as  states,  counties,  school 
districts,  etc. 


AERA. American Educational Research
Association.


age-eligible. An individual who meets
the age definition for one of the
National Assessment populations:
9-year-olds, 13-year-olds,
17-year-olds.


aggregate estimate. Estimate for a
combination of smaller groups for
which estimates have been produced.


allocation. Apportionment of a total
sample size to various parts of the
population. (See final allocation.)


almanacs. The sets of tables
summarizing NAEP results.


anchoring. The process of
characterizing score levels in terms
of predicted observable behavior.


ARM. See Average Response Method.


assessment. The documentation of the
progress in knowledge, skills and
attitudes of American youth. Measures
are taken at periodic intervals for
each learning area, with the goal of
determining trends and reporting the
findings to the public and to the
education community. See also
National Assessment of Educational
Progress.




assessment administrator. Individual
employed to administer the assessment
in participating schools.

assessment session. The period of time
during which a NAEP package is
administered to one or more
individuals.


Average Response Method. A
regression-based technique for predicting
for a respondent the conditional
distribution of an average score on a
set of exercises given responses to at
least one of the exercises and other
information. Used to produce the NAEP
Year 15 Writing Scale.


average sample size. The average sample
obtained per sampling unit selected.


background and attitude items. See
non-cognitive assessment.


bias. In statistics, the difference
between the expected value of an
estimator and the population parameter
being estimated. If the average value
of the estimator over all possible
samples (the estimator's expected
value) equals the parameter being
estimated, the estimator is said to be
unbiased; otherwise, the estimator is
biased.
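
The definition of bias can be checked by brute force on a tiny population. The sketch below is an illustration only: the population `[1, 2, 3]`, the sample size of 2, and with-replacement sampling are all assumptions chosen to keep the enumeration small. It lists every possible sample, averages two variance estimators over them, and compares each average with the true population variance.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical population
n = 2                    # hypothetical sample size

N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((Fraction(x) - mu) ** 2 for x in population) / N  # parameter

def variance_estimate(sample, divisor):
    """Sum of squared deviations from the sample mean, over `divisor`."""
    mean = Fraction(sum(sample), len(sample))
    return sum((Fraction(x) - mean) ** 2 for x in sample) / divisor

# Every ordered sample of size n drawn with replacement is equally likely.
samples = list(product(population, repeat=n))

expected_div_n = sum(variance_estimate(s, n) for s in samples) / len(samples)
expected_div_n1 = sum(variance_estimate(s, n - 1) for s in samples) / len(samples)

bias_div_n = expected_div_n - sigma2     # nonzero: biased estimator
bias_div_n1 = expected_div_n1 - sigma2   # zero: unbiased estimator

print("parameter:", sigma2)
print("bias with divisor n:", bias_div_n)
print("bias with divisor n-1:", bias_div_n1)
```

Exact fractions are used so that an unbiased estimator shows a bias of exactly zero rather than a small floating-point residue.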


BIB (Balanced Incomplete Block)
spiralling. A complex variant of
multiple matrix sampling, in which a
small subset of items is administered
to each respondent in such a way that
each pair of items is administered to
a nationally representative subsample
of respondents.
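
The pairing property in this entry can be checked on a toy design. The sketch below is not the NAEP booklet layout: it assumes a hypothetical pool of 7 blocks arranged into 7 booklets of 3 blocks each, generated cyclically from the base set (0, 1, 3), and it treats "spiralling" as handing booklets to students in rotation. It then counts how often each pair of blocks shares a booklet.

```python
from collections import Counter
from itertools import combinations

n_blocks = 7
base = (0, 1, 3)  # assumed base set; its differences cover 1..6 mod 7 once each

# One booklet per cyclic shift of the base set.
booklets = [
    tuple(sorted((block + shift) % n_blocks for block in base))
    for shift in range(n_blocks)
]

# How many booklets contain each pair of blocks, and each single block.
pair_counts = Counter(
    pair for booklet in booklets for pair in combinations(booklet, 2)
)
block_counts = Counter(block for booklet in booklets for block in booklet)

# Spiralling (assumed form): booklets are dealt out in rotation.
n_students = 21  # hypothetical session size
assignment = [booklets[i % len(booklets)] for i in range(n_students)]

print(booklets)
print("distinct pairs covered:", len(pair_counts))
print("booklets per pair:", set(pair_counts.values()))
print("booklets per block:", set(block_counts.values()))
```

Because every pair of blocks appears together in the same number of booklets, relationships between any two blocks can be estimated from students who saw both, even though no student sees the whole pool.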


BILOG. A computer program for
estimating item parameters by marginal
estimation procedures.


block. A group of assessment items
created by dividing the item pool for
a grade/age into subsets. Used in the
implementation of the BIB and UBIB
Spiral sample design.

booklet. The assessment instrument
created by combining blocks of
assessment items.


bridging. An administration of the
same set of exercises under two
different conditions or to two
different populations to allow a
statistical link ("bridge") to be
established between results under the
different circumstances.


calibrate. To estimate the parameters
of a set of items from responses of a
sample of examinees.

category (scoring). A classification of
a response to an open-ended item. See
Scoring Guide.


category within a variable. A
sub-classification within a variable,
or subgroup. For example, Male and
Female are categories of the subgroup
Sex. See Reporting Subgroups.


cell. The smallest unit of a table.
For example, a two-way table with 5
rows and 7 columns contains 35 cells
(5 x 7 = 35).




ceimn  tmcf  (CT).    SmilU  relatively 
permanent  areas  into  which  large 
cities  and  adjacent  areas  are  divided 
for  the  puipose  of  providing 
small-area  statistics.  The  average 
census  tract  contains  approximately 
4,000  residents. 


clustering. The process of forming
sampling  units  as  groups  of  other 
units. 


codebook.    A  printout  of  the  raw  data 
files  for  each  student,  excluded 
student,  teacher  and  school  in  a 
particular  grade/age. 


coefficient of variation. The ratio of
the  standard  deviation  of  an  estimate 
to  the  value  of  the  estimate. 
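
As a minimal sketch of this ratio (the function name and the sample figures are illustrative assumptions, not values from the report):

```python
def coefficient_of_variation(estimate, standard_deviation):
    """Ratio of the standard deviation of an estimate to the estimate itself."""
    if estimate == 0:
        raise ValueError("the coefficient of variation is undefined for a zero estimate")
    return standard_deviation / estimate
```

For example, an estimate of 250.0 with a standard deviation of 5.0 would have a coefficient of variation of 0.02, that is, 2 percent.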


cognitive assessment. The portion of the
Year 15 NAEP which assessed students'
abilities  in  the  learning  areas  of 
reading  and  writing. 


combined ratio estimator. The ratio
estimator  resulting  from  first 
estimating  the  numerator  and  the 
denominator  values  and  then  using  the 
quotient  of  these  as  the  estimate  of 
the  ratio. 
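
A short sketch of the idea, assuming each sampled unit carries a sampling weight (the argument names are hypothetical):

```python
def combined_ratio(weights, numerator_values, denominator_values):
    """Estimate a ratio as (weighted total of y) / (weighted total of x).

    Both totals are estimated over the whole sample first; the ratio is
    formed only once, from the two estimated totals.
    """
    numerator = sum(w * y for w, y in zip(weights, numerator_values))
    denominator = sum(w * x for w, x in zip(weights, denominator_values))
    return numerator / denominator
```

With weights [2, 3], numerator values [1, 0] and denominator values [1, 1], the estimated totals are 2 and 5, giving a ratio of 0.4.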


common block. A group of background
items included in the beginning of
every assessment booklet.


complete enumeration survey. Survey in
which the entire population is
enumerated or surveyed; a census.


conditional  probability.  Probability  of 
an  event,  given  the  occurrence  of 
another  event. 


conditioning variables. Demographic
variables characterizing a respondent.
Used  in  construction  of  plausible 
values. 


controlled selection. A method of
probability sampling involving
balanced samples on asymmetrical
controls. Further controls beyond
stratification are used.


CPS. See Current Population Survey.


Current Population Survey. A household
sample survey conducted monthly by the
Bureau of the Census to provide
estimates of employment, unemployment,
and other characteristics of the
general labor force, of the population
as a whole, and of various subgroups
of the population.


CV. See coefficient of variation.


data  editing.  The  process  by  which 
assessment  responses  and  other 
information are verified.


data  entry.     The  process  by  which 
assessment  responses  and  other 
information  are  transferred  from  paper 
to computer.


degrees of freedom [of a variance
estimator]. The number of independent
pieces of information used to generate
a variance estimate. For the
jackknife variance estimator used in
Year 15 NAEP, this is at most 32, the
number of PSU pairs.

demographic subgroups. See reporting
subgroups.




derived variables. Subgroup data that
were not obtained directly from
assessment responses, but through
procedures of interpretation,
classification or calculation. See
also reporting subgroups.


design effects. The ratio of the
variance for the sample design to the
variance for a simple random sample of
the same size.
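
The ratio can be sketched for an estimated proportion as follows. The function names are illustrative, and the simple-random-sample variance p(1 - p)/n ignores any finite population correction, which is an assumption of this sketch.

```python
def srs_variance_of_proportion(p, n):
    """Variance of a proportion under simple random sampling of size n."""
    return p * (1.0 - p) / n


def design_effect(design_variance, srs_variance):
    """Ratio of the variance under the actual design to the SRS variance."""
    return design_variance / srs_variance
```

A design effect of 2.0 would mean that the complex sample is about as precise as a simple random sample of half its size.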


distractor. An incorrect response
choice included in a multiple-choice
exercise.


entry mode. Processing option under the
data entry system; used for the
initial transcription of assessment
data.


ETS. See Educational Testing Service.


examinee. Same as respondent.


Excluded Student Questionnaire. An
instrument used in the Year 15
assessment; completed for every
student  who  was  sampled  but  was 
excluded  from  the  assessment. 


District Supervisor. One of 16
supervisors responsible for contacting
schools, arranging and conducting
introductory meetings, recruiting,
training and providing support to
Exercise  Administrators,  distributing 
and  collecting  questionnaires, 
completing  administrative  reporting 
forms,  and  packing  and  shipping  all 
materials  to  ETS. 


double-length block. A group of
assessment exercises, 28 minutes long,
created  to  accommodate  the  use  of 
longer  exercises;  used  in  UBIB  spiral 
administration. 


EA. See Exercise Administrator.

ECS. See Education Commission of the
States.

Education Commission of the States.
The NAEP grantee prior to Year 15.


Educational Testing Service. The NAEP
grantee for Year 15.


excluded students. Sample students who
were determined by the school to be
unable to participate because they had
limited  English-speaking  ability,  were 
educable  mentally  retarded,  or 
functionally  disabled. 


Exercise  Administrator.     The  person 
whose  primary  function  was  to 
administer  the  assessment  booklets  to 
the  sample  students. 


exercise. A task designed to measure an
objective. Because NAEP does not
administer "tests," but instead
describes  educational  achievement  over 
time,  the  term  "exercise"  is  often 
used  instead  of  the  term  "item"  or 
"test  item."  The  terms  "item"  and 
"exercise"  are  used  synonymously  in 
this  report. 


exercise  booklet.     See  booklet. 


exercise  part.    Each  portion  of  an 
exercise  that  asks  a  separate 
question. Parts may all pertain to
one  stimulus,  such  as  a  graph  or  a 
table,  or  may  concern  the  same  topic. 




exercise  pool.     The  entire  set  of 
exercises  prepared  for  a  learning 
area.  This  set  includes  recycled 
exercises  developed  for  previous 
assessments but not used due to
exercise  booklet  or  budgetary 
constraints  and  newly  developed 
exercises. 


expected value. The average of the
sample estimates given by an estimator
over all possible samples. If the
estimator is unbiased, then its
expected value will equal the
population  value  being  estimated. 


extra subsampling. Subsampling to
obtain smaller than desired
subsampling fractions. Occasionally
used in schools with an unusually
large amount of recent growth in
numbers of students in order to reduce
workload.


field test. A pretest of exercises to
obtain information regarding clarity,
difficulty levels, timing, feasibility
and special administrative problems
needed for revision and selection of
exercises to be used in the
assessment.


final allocation. Usually determined by
rounding or adjusting a preliminary
sample allocation to integer numbers.
See allocation.


first-stage sampling unit. See
multi-stage sample design.


foils.    The  correct  and  incorrect 
response  choices  included  in  a 
multiple-choice  exercise. 


fourth-stage sampling unit. See
multi-stage sample design.


free-response item. Same as open-ended
response item.


grade-eligible.     An  individual  who  meets 
the grade definition for one of the
Year 15 National Assessment
populations: Grade 4, Grade 8, or
Grade 11.


grade/age-eligible.     A  student  who  meets 
the  age  or  grade  definition  for  one  of 
the Year 15 National Assessment
populations: Grade 4 or Age 9, Grade
8 or Age 13, Grade 11 or Age 17.


group administered package. A package
containing  exercises  which  can  be 
administered  to  groups  of  students. 


group effect. The difference between

the  mean  for  a  group  and  the  mean  for 
the  nation. 


holistic scoring. A method of scoring
open-ended response exercises that
evaluates  a  response  on  the  basis  of 
overall  impression. 


imputation. Prediction of a missing
value according to some procedure,
using a mathematical model in
combination with available
information. See plausible values.


imputed race/ethnicity. The
race/ethnicity of an assessed student,
as derived from his or her responses
to three particular common background
items. A Year 15 reporting subgroup.




in-school  sample  design.     Sample  design 
for  the  National  Assessment  school 
survey. See sample design.

individual completion rate. Proportion
of  eligibles  in  the  sample  who  respond 
by  completing  one  or  more  assessment 
packages. 


ineligible. Student who is not eligible
for National Assessment because he or
she does not satisfy grade or age
requirements (see grade/age-eligible).


informative writing. A writing
objective of the Year 15 assessment;
writing  that  is  used  to  share 
knowledge  and  convey  messages, 
instructions  and  ideas. 


intelligent data entry system. A set
of computer programs and procedures
developed  in  accordance  with  the  NAEP 
design  to  validate,  verify,  transcribe 
and  check  for  the  reasonableness  of 
available  data. 

IRT. See item response theory.


item. See exercise.


item block. See block.


item booklet. See booklet.


item part. See exercise part.


item pool. See exercise pool.


item response theory. Test analysis
procedures that assume a mathematical
model for the probability that a given
examinee will respond correctly to a
given exercise.
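
One commonly used model of this kind is the three-parameter logistic model, sketched below. The glossary entry does not name a specific model, so the functional form, the parameter names (a for discrimination, b for difficulty, c for the lower asymptote) and the scaling constant 1.7 are assumptions of this illustration.

```python
import math


def prob_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response.

    theta: examinee proficiency; a: item discrimination;
    b: item difficulty; c: lower asymptote ("guessing") parameter.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))
```

When the proficiency equals the item difficulty, the probability lies halfway between c and 1; for c = 0.2 that is 0.6.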


jackknife. A procedure to estimate
standard  errors  of  percentages  and 
other  statistics.  Particularly  suited 
to  complex  sample  designs. 
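
A paired-PSU form of the jackknife can be sketched as follows. This is a generic illustration under stated assumptions, not the exact procedure of the report: each record is a hypothetical tuple (pair_id, psu, weight, value) with psu coded 0 or 1 within its pair, and each pseudo-replicate drops the first PSU of one pair while doubling the weights of its partner.

```python
def weighted_mean(records):
    """Weighted mean of the values in (pair_id, psu, weight, value) records."""
    total_weight = sum(w for _, _, w, _ in records)
    return sum(w * v for _, _, w, v in records) / total_weight


def paired_jackknife_variance(records, statistic=weighted_mean):
    """Sum of squared deviations of the pseudo-replicates from the full estimate."""
    full_estimate = statistic(records)
    variance = 0.0
    for pair_id in sorted({r[0] for r in records}):
        replicate = []
        for pair, psu, weight, value in records:
            if pair == pair_id:
                if psu == 0:
                    continue         # drop the first PSU of this pair
                weight = 2 * weight  # double the weight of its partner
            replicate.append((pair, psu, weight, value))
        variance += (statistic(replicate) - full_estimate) ** 2
    return variance
```

The square root of the returned value serves as the standard error estimate; the number of pairs bounds the degrees of freedom.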


learning  area.     One  of  the  areas 

assessed by National Assessment, e.g.,
art, career and occupational
development,  citizenship,  literature, 
mathematics,  music,  reading,  science, 
social  studies  and  writing. 


literary  writing.     In  the  Year  15 

assessment,  writing  from  a  basis  of 
experience  and  imaginative  ideas  to 
share  experiences  and  understand  the 
world. 


LOGIST.     A  computer  program  for 
estimating  item  parameters  by  joint 
estimation  procedures. 


machine-readable  catalog.      Year  15 
computer  processing  control 
information,  IRT  parameters,  foil 
codes  and  labels  in  a 
computer-readable  format. 


major  strata.     Used  to  stratify  the 

primary  sampling  frame  within  each 
region. Involves stratification by
size  of  community  and  degree  of 
ruralness  (SDOC). 


marginal value. A row or column total,
the  sum  of  all  cell  values  in  the  row 
or  column. 




meanparts  estimator.     Estimates  a 

subgroup  average  score  across  a  set  of 
items  by  the  average  of  the  subgroup 
scores  for  each  of  the  items.  Can  be 
extended  to  any  linear  estimator. 


mechanics scoring. A method of scoring
open-ended  response  exercises  that 
evaluates  elements  of  sentence 
construction,  word  choice,  spelling, 
punctuation  and  capitalization. 


multiple-county PSU. A primary sampling
unit  (PSU)  composed  of  two  or  more 
counties. 


NIE. National Institute of Education.


nine-year-olds.    One  of  the  National 
Assessment target populations. For
Year 15, defined as persons born
during  calendar  year  1974. 


modal  age.    The  age  of  the  majority  of  a 
group  of  grade-eligible  students:  Age 
9  for  fourth  graders.  Age  13  for 
eighth  graders  and  Age  17  for  eleventh 
graders. 


non-cognitive  assessment.  The 

background  questions  used  to  collect 
information  from  students  about 
activities,  attitudes  and 
demographics. 


modal  grade.     The  grade  attended  by  the 
majority  of  a  group  of  age-eligible 
students:  the  fourth  grade  for 
9-year-olds,  the  eighth  grade  for 
13-year-olds  and  the  eleventh  grade 
for  17-year-olds. 


mode  of  administration.     The  method  by 
which  students  are  administered 
assessment instruments; in Year 15 the
modes of administration were spiralled
and taped.


multi-stage  sample  design.  Indicates 

more  than  one  stage  of  sampling.  An 
example of three-stage sampling: 1)
sample  of  counties  (primary  sampling 
units  or  PSUs);  2)  sample  of  schools 
within  each  sample  county;  3)  sample 
of  students  within  each  sample  school. 


multiple  matrix  sampling.  Sampling 
plan  in  which  different  samples  of 
respondents  take  different  samples  of 
items. 


nonresponse.     The  failure  to  obtain 
responses  or  measurements  for  all 
sample  elements. 


nonsampling error. A general term
applying to all sources of error
except sampling error. Includes
errors  from  defects  in  the  sampling 
frame,  response  or  measurement  error, 
and  mistakes  in  processing  the  data. 


objective. A desirable education goal
agreed upon by scholars in the field,
educators and concerned lay persons,
and established through the consensus
approach.


objectives  re-development.     A  review  of 
the  learning  area  objectives  following 
the  initial  assessment  of  a  learning 
area; carried out by scholars in the
field, educators and concerned lay
persons.  May  result  in  revision, 
modification  or  total  rewriting  of  the 
learning-area  objectives  to  reflect 
current  curricular  goals  and  emphases. 




observational unit. The individual
units  for  which  characteristics  are 
observed  or  measurements  are  obtained. 


observed race/ethnicity. Race/ethnicity
of an assessed student as perceived by
the Exercise Administrator.


OERI. Office of Educational Research
and Improvement.


OMB. Office of Management and Budget.


open-ended response item. A

non-multiple-choice  exercise  that 
requires  some  type  of  written  or  oral 
response. 


oversampling. Deliberately sampling a
portion of the population at a higher
rate than the remainder of the
population.


paced tape. A tape recording

accompanying  each  tape  administration 
package  to  assure  uniformity  in 
administration.  Instructions  are 
played  back  from  the  tape  recording  to 
prevent  reading  difficulties  from 
interfering  with  an  individual's 
ability  to  respond.  Includes  response 
time. 


parental education. The level of

education  of  the  mother  and  father  of 
an  assessed  student  as  derived  from 
the  student's  response  to  two 
assessment  items.  A  Year  15  reporting 
subgroup. 


participant. See respondent.


percent correct. The estimated

proportion  of  a  target  population  who 
would  answer  a  particular  exercise 
correctly. 


persuasive writing. A writing
objective of the Year 15 assessment.
Writing that attempts to bring about
some  action  or  change. 


plausible values. Proficiency values
drawn  at  random  from  a  conditional 
distribution  of  a  NAEP  respondent 
given  his  or  her  response  to  cognitive 
exercises  and  a  specified  subset  of 
background  variables  (conditioning 
variables).  The  selection  of  a 
plausible  value  is  a  form  of 
imputation. 


population.    An  aggregate  of  elements, 
usually  individual  units  with 
associated  characteristics  for 
observation  or  measurement. 


post-stratification. Classification and
weighting  of  selected  sampling  units 
by  a  set  of  strata  definitions  after 
the  sample  has  been  selected. 


PPS. Probability Proportional to Size.


precision.     The  expected  difference 
between  the  expected  value  and  the 
sample  estimate  of  a  population  value, 
as  measured  by  the  sampling  error. 


Primary Sampling Unit (PSU). The basic
geographic sampling unit for National
Assessment. A PSU is either a single
county  or  a  set  of  contiguous 
counties.  See  also  multi-stage  sample 
design. 




primary trait scoring. A method of
scoring open-ended response exercises
by evaluating the ability to write for
precisely defined purposes. Criteria
for evaluating responses are
associated with specific point scores
in a scoring guide.


PUDT. See public-use data tapes.


QED. Quality Education Data, Inc. A
supplier of lists of schools and school
districts  with  school  data  for  Year 
15. 


Principal Questionnaire. A data
collection  form  given  to  school 
principals  prior  to  assessments.  The 
principals  respond  to  questions 
concerning  enrollments,  size  of  the 
community,  occupational  composition  of 
the community, etc.


Probability Proportional to Estimated Size
(PPES). Selection method where
probabilities of selection for
sampling units are assigned in
proportion to the magnitude of the
estimated size measure for each unit.
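
The assignment of probabilities can be sketched as below. The function name and the check for probabilities above 1 are illustrative assumptions; the handling of very large units in the actual sample design is not described in this entry.

```python
def selection_probabilities(size_measures, n_to_select):
    """Selection probabilities proportional to each unit's size measure.

    Each probability is n_to_select * size / total size, so the
    probabilities sum to the number of units to be selected.
    """
    total = sum(size_measures)
    probabilities = [n_to_select * size / total for size in size_measures]
    if any(p > 1.0 for p in probabilities):
        raise ValueError("a unit is too large to be selected with probability <= 1")
    return probabilities
```

For estimated sizes of 100, 300 and 600 with one unit to be selected, the probabilities would be 0.1, 0.3 and 0.6.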


Probability Sample. A sample in which
every element of the population has a
known,  non-zero  probability  of  being 
selected. 


proportional allocation. Allocation of
a sample to strata in proportion to
observational units in each stratum.
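
A brief sketch of such an allocation, with the rounding step that turns it into whole numbers. The largest-remainder rounding rule is an assumption chosen for illustration; the entry does not say how fractional allocations were resolved.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Allocate sample_size across strata in proportion to stratum sizes.

    Uses integer arithmetic; any units left over after taking the whole
    parts go to the strata with the largest remainders.
    """
    total = sum(stratum_sizes)
    shares = [divmod(sample_size * size, total) for size in stratum_sizes]
    allocation = [whole for whole, _ in shares]
    leftover = sample_size - sum(allocation)
    by_remainder = sorted(range(len(shares)),
                          key=lambda i: shares[i][1], reverse=True)
    for i in by_remainder[:leftover]:
        allocation[i] += 1
    return allocation
```

Strata of 500, 300 and 200 units with a sample of 100 would receive 50, 30 and 20 units respectively.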


pseudo-replicate. The value of a

statistic  based  on  an  altered  sample. 
Used  by  the  jackknife  variance 
estimator. 


PSU.      See  primary  sampling  unit. 


public-use data tapes. Computer tapes
containing  respondent-level  cognitive 
item,  background  and  attitude  and 
demographic  data.  Available  for  use 
by  researchers  wishing  to  do  secondary 
analyses  of  NAEP  data. 


random variable. A variable which takes
on  any  value  of  a  specified  set  with  a 
particular  probability. 


reading  proficiency  scale.      Scale  (0  to 

500)  based  on  IRT  upon  which  levels  of 
reading  performance  can  be  measured. 


receipt  control.     Procedures  used  by 
scoring  staff  to  check  in  and  screen 
field  materials.  Information  from 
these  procedures  is  relayed  to 
assessment  administrative  staff  so 
that  any  errors  may  be  corrected. 


recycled  exercises.     The  set  of 

exercises  that  is  kept  secure  from  one 
assessment  to  the  next  that  will  be 
used  to  measure  changes  (growth, 
stability  or  decline)  in  performance 
for  the  learning  area. 


region.     One  of  four  geographical 
regions  used  in  gathering  and 
reporting  data:  Northeast,  Southeast, 
Central  and  West  (as  defined  by  the 
Office of Business Economics, U.S.
Department of Commerce). A Year 15
reporting subgroup.


released  item.     An  item  for  which 
results  and  item  text  have  been 
reported  to  the  public. 




reliability check. The scoring of

open-ended  response  items  by  a  second 
scorer.  In  Year  15,  twenty  percent  of 
these  items  underwent  reliability 
checks. 


reporting  subgroups.     Groups  within  the 
national  population  for  which  National 
Assessment  data  are  reported:  sex, 
race/ethnicity,  grade,  age,  level  of 
parental  education,  region,  and  size 
and  type  of  community. 


response experience. Response rates

observed  in  previous  surveys  which  are 
used  for  planning  purposes. 


response  options.  Different 

alternatives  to  a  multiple-choice 
question that can be selected by the
respondent. 


response rate. Proportion of specified
units  for  which  responses  or 
measurements  are  obtained. 


rescore.    If  an  open-ended  exercise  was 
scored  under  different  conditions  than 
presently  held  or  if  passage  of  time 
may affect scoring, responses from an
earlier assessment may be rescored at
the  same  time  as  responses  from  a 
later  assessment.  Responses  from  an 
earlier  assessment  also  may  be  held 
and  not  scored  so  that  they  can  be 
scored  with  responses  from  a  later 
assessment. 


review conference. A conference held to
review the objectives of a learning
area to assure their acceptance as
measures of the objectives by
scholars, educators and lay persons or
to review exercises for racial,
ethnic, social or regional bias.


RP scale. See reading proficiency
scale.


Research Triangle Institute. The NAEP
survey subcontractor prior to the Year
15 assessment; drew the sample of PSUs
and schools for the Year 15
assessment.


resolution mode. Processing option

under  the  data  entry  system;  used  for 
the  correction  of  erroneous  or 
discrepant  data  values. 


respondent.     A  person  who  is  eligible 
for  National  Assessment,  is  in  the 
sample,  and  who  responds  by  completing 
one  or  more  items  in  an  assessment 
booklet. 


RTI. See Research Triangle Institute.


sample  design  parameter.     A  population 
parameter  or  a  survey  parameter,  such 
as  an  expected  response  rate,  used  in 
designing  a  sample. 


sample design. Specifications for
selecting a sample plus specifications
for processing the sample data to make
estimates. See sampling plan.


sample size. The number of units in the
sample. (See also average sample
size.)


response  error.    The  difference  between 
the  observed  value  and  the  true  value 
for an observational unit.




sample  survey.     As  opposed  to  a  census, 
a  data  collection  process  whereby  only 
a  sample  of  the  population  is  observed 
or  measured. 


sample. A portion of a population, or a
subset from a set of units, selected
by some probability mechanism for the
purpose of investigating the
properties  of  the  population.  NAEP 
does  not  assess  an  entire  grade/age 
population  but  rather  selects  a 
representative  sample  from  the 
grade/age  group  to  answer  assessment 
items. 


sampling  error.     The  error  in  survey 
estimates  that  occurs  because  only  a 
sample  of  the  population  is  observed. 
Measured  by  standard  error  and 
variance. 


sampling frame. The list of sampling
units  from  which  the  sample  is 
selected. 


sampling  plan.  Set  of  specifications 
and  procedures  used  to  select  a 
sample. See sample design.


School Characteristics and Policy
Questionnaire. A five-page
questionnaire completed for each
school by the principal or other
official; used to gather information
concerning  school  administration, 
staffing  patterns,  English  curriculum 
and  student  services. 


school  district.     Administrative  unit  of 
the  public  school  system,  usually 
involving  a  school  system  under  a 
single district organization.


school response rate. The response rate
for a sample of schools. (See
response  rate.) 


scoring  guide.     A  guide  for  hand  scoring 
an  open-ended  response  item  that 
specifies  descriptive  or  diagnostic 
categories  by  giving  definitions  and 
example  responses. 


second-stage  sampling  unit.  See 
multi-stage  sample  design. 

Secondary Sampling Unit (SSU). See
multi-stage sample design.

secondary traits. Characteristics of a
response  to  an  open-ended  exercise 
indicating  the  presence  or  absence  of 
elements  that  are  of  special 
significance  to  the  exercise. 


secure items. Items not released for
public use, in order to be
readministered  in  subsequent 
assessments  to  determine  whether 
performance  levels  have  increased, 
decreased  or  remained  the  same. 


selection  probability.     The  probability, 
or  chance,  that  a  particular  sampling 
unit  has  of  being  selected  in  the 
sample. 


SES. See socioeconomic status.


session.     See  assessment  session. 


seventeen-year-old.     One  of  the  National 
Assessment target populations. For
Year 15, defined as persons born from
October 1, 1966 to September 30, 1967.




sex.     One  of  the  NAEP  reporting 

subgroups.  Assessment  results  are 
reported  for  males  and  females. 


simple random sample. Process for
selecting  n  sampling  units  from  a 
population  of  N  sampling  units  so  that 
each  sampling  unit  has  an  equal  chance 
of  being  in  the  sample  and  every 
combination  of  n  sampling  units  has 
the  same  chance  of  being  in  the  sample 
chosen.


single-length block. A group of

assessment  items,  14  minutes  long, 
containing  an  average  of  12  minutes  of 
reading  and  writing  exercises  and  two 
minutes  of  background  and  attitude 
questions. 


Size and Type of Community (STOC). One
of the NAEP reporting subgroups,
dividing the communities in the nation
into  seven  groups  based  on  size  and 
other  characteristics. 


size  measure.    Value  of  a  variable  used 
to  determine  the  allocation  of  the 
sample  to  strata  or  used  to  assign 
selection  probabilities  to  sampling 
units  within  a  stratum. 


size stratum. A stratum based upon the
value  of  the  size  measures  for  units 
placed  in  the  same  stratum;  e.g.,  a 
stratum for the largest units.


SMSA. See standard metropolitan
statistical area.


Socioeconomic Status (SES). For

sampling,  the  lower  SES  portion  of  the 
population  (approximately  20  percent) 
is  considered  a  subpopulation  to  be 
sampled. 


SSU  size  measure.     Measure  of  size  for  a 
secondary  sampling  unit  (SSU). 


standard error. A measure of sampling
variability for a statistic. Because
of NAEP's complex sample design,
standard  errors  are  estimated  by 
jackknifing  the  samples  from 
first-stage  sample  estimates. 


Standard  Metropolitan  Statistical  Area 
(SMSA). An area defined by the
federal  government  for  the  purposes  of 
presenting  general-purpose  statistics 
for  metropolitan  areas.  Typically,  an 
SMSA  contains  a  city  of  at  least 
50,000  population  plus  adjacent  areas. 


stem.  The  portion  of  an  item  that 
states  the  problem  or  asks  the 
question. 


stimulus.     For  reading  items,  a  visual 
stimulus  used  as  part  of  the  stem. 


STOC.      See  size  and  type  of  community. 


stratification.     The  division  of  a 

population  into  parts,  called  strata. 


stratified  sample.     A  sample  selected 
from  a  population  which  has  been 
stratified  with  a  sample  selected 
independently in each stratum. The
strata  are  defined  for  the  purpose  of 
reducing  sampling  error. 


student  frame.     List  of 

grade/age-eligible  students  within  a 
sample school.




student  ID  number.     A  unique 

identification  number  assigned  to  each 
respondent  to  preserve  his  or  her 
anonymity. NAEP does not record the
names  of  any  respondents. 


systematic  sample  (systematic  random 
sample).    A  sample  selected  by  a 
systematic  method;  for  example,  when 
units are selected from a list at
equally  spaced  intervals. 
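
The equally spaced selection can be sketched as follows. The use of a whole-number interval and a random starting point within the first interval are assumptions of this illustration.

```python
import random


def systematic_sample(units, n, rng=None):
    """Select n units from a list at equally spaced intervals.

    The interval is len(units) // n; the starting position is drawn at
    random from the first interval.
    """
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of units")
    rng = rng or random.Random()
    interval = len(units) // n
    start = rng.randrange(interval)
    return units[start::interval][:n]
```

With 100 listed units and n = 10, the interval is 10, so a random start of 3 would select the units at positions 3, 13, 23, and so on.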


student response rate. The response
rate for a sample of students. See
response rate.


study skill item. An item requiring a
special learned skill beyond the
facility of recognizing and
understanding the printed word; for
example, the interpretation of a bar
graph, telephone bill or table of
contents.


subgroup. See reporting subgroup.


subject areas. See learning areas.


subpopulation.      See  reporting  subgroup. 


subsampling. Selection of a sample from
a larger sample. Also used to
describe multi-stage sampling.


subsegmenting. Operation of subdividing
the  area  of  a  segment  into  several 
subareas  and  selecting  one  of  the 
subareas. 


survey  design.     All  specifications  and 
procedures involved in a survey.


survey  population.     The  population 
actually  surveyed  or  represented  by 
the sample. May differ from the
target  population. 


TAC. See Technical Advisory Committee.


tapescript.     A  script  prepared  for  the 
announcer  to  use  in  producing  the 
paced  tape,  indicating  exactly  what  is 
to be read or not read aloud to the
students  as  well  as  the  amount  of 
response  time  allowed  for  each 
exercise.  See  paced  tape. 


target  population.     Same  as  population. 


Teacher  Questionnaire.      A  nine-page 
questionnaire  completed  by  selected 
English  and  Language  Arts  teachers; 
used  to  gather  information  concerning 
years of teaching experience, frequency
of writing assignments, teaching
materials  used,  and  the  availability 
and  use  of  computers. 


Technical Advisory Committee. Committee
of  experts  in  areas  of  educational 
policy  and  procedures,  mathematics, 
and  measurement  theory;  provides 
advice  and  recommendations  concerning 
NAEP  staff  technical  plans  such  as 
sampling,  program  implementation  and 
analyses. 


theta scale. A rescaling of the
Reading  Proficiency  scale  that 
standardizes  the  combined  age  and 
grade  samples.  Item  response  theory 
calculations  are  carried  out  in  the 
theta  scale  for  mathematical 
convenience,  then  transformed  to  the 
Reading  Proficiency  scale  for 
reporting  purposes. 




third-stage sampling unit. See
multi-stage  sample  design. 


thirteen-year-olds.    One  of  the  National 
Assessment  target  populations.  For 
Year  15,  defined  as  persons  born 
during  calendar  year  1970. 


T-unit.     Used  to  assess  the  quality  of 
syntax used in an essay; an
independent  clause  and  all  of  its 
modifying  words,  phrases  and  clauses. 


UBIB (Unbalanced Incomplete Block)
spiralling. Refers to a portion of
the  spiral  design  in  which  each 
booklet  contains  a  common  block  of 
background  questions,  a  single-length 
block  of  assessment  exercises,  and  a 
double-length  block  of  assessment 
exercises. 


weight. A multiplicative factor equal
to the reciprocal of the probability
of a respondent being selected for
assessment with adjustment for
nonresponse and perhaps also for
post-stratification; an estimate of
the  number  of  persons  in  the 
population  represented  by  a  respondent 
in  the  sample.  The  sum  of  weights  for 
all  respondents  at  an  age  level  is  an 
estimate  of  the  number  of  persons  in 
the  country  at  that  age  level. 
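
A minimal sketch of how such a weight is formed and used. The function name and the single multiplicative nonresponse adjustment are illustrative assumptions; post-stratification is omitted.

```python
def respondent_weight(selection_probability, nonresponse_adjustment=1.0):
    """Reciprocal of the selection probability, times a nonresponse adjustment."""
    if not 0.0 < selection_probability <= 1.0:
        raise ValueError("selection probability must be in (0, 1]")
    return nonresponse_adjustment / selection_probability


def estimated_population_size(weights):
    """Sum of respondent weights: an estimate of the population size."""
    return sum(weights)
```

Three respondents selected with probabilities 0.01, 0.02 and 0.01 would carry weights of 100, 50 and 100, representing an estimated 250 persons in total.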


Weighted Average Response Method (WARM).
A  generalization  of  the  Average 
Response  Method  allowing  the 
estimation  of  weighted  averages. 


Westat, Inc. The NAEP survey
subcontractor  for  the  Year  15 
assessment. 


unequal probability sampling. A sample
selection  procedure  in  which  the 
sampling  units  have  assigned  selection 
probabilities  which  are  not  equal  for 
all  units. 


user tapes. See public-use data tapes.


variance.     The  square  of  the  standard 
error;  the  average  of  the  squared 
deviations  of  a  random  variable  from 
the expected value of the variable.


verification  mode.     Processing  option 
under  the  data  entry  system;  used  for 
the  substantiation  of  data  values. 


WARM.      See  Weighted  Average  Response 
Method. 


Winsorizing. Replacement of data

values  which  are  more  extreme  than  a 
given  threshold  by  that  threshold 
value.  Bounds  the  influence  of 
extreme  data  values  on  an  estimator 
while  maintaining  information  on  the 
sign  of  the  values. 
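
The replacement rule can be sketched as below, assuming a single positive threshold applied symmetrically to positive and negative values (the entry does not state how the threshold was chosen).

```python
def winsorize(values, threshold):
    """Replace values beyond +/- threshold by the threshold, keeping their sign."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    return [max(-threshold, min(threshold, value)) for value in values]
```

With a threshold of 3, the values -5, -1, 0, 2 and 9 become -3, -1, 0, 2 and 3: the two extreme values are pulled in to the threshold but remain negative and positive respectively.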


writing  scale.      Scale  based  on  Average 
Response  Method  upon  which  levels  of 
writing  performance  can  be  measured. 

Year 01, 02, 03, ... 15. A sequential
number assigned to each period of
assessment activities in the field.
Year 15 pre-assessment activities
began in May 1983; assessment
activities began in August 1983 and
ended  in  May  1984. 




LIST  OF  REFERENCES 




American Psychological Association (1985). Standards for Educational and Psychological
Testing. Washington, DC: Author.

Applebee, A. N., Langer, J. A., & Mullis, I. V. S. (1986a). Writing: Trends across the
decade, 1974-1984. (NAEP Report 15-W-01). Princeton, NJ: Educational Testing
Service.

Applebee, A. N., Langer, J. A., & Mullis, I. V. S. (1986b). The writing report card:
Writing achievement in American schools, 1984. Educational Testing Service.

Barone, J. L., Norris, N. A., & Rogers, A. M. (1986). NAEP 1983-84 public-use data
tapes version 3.1 users' guide. Princeton, NJ: Educational Testing Service.

Beall, G. (1971). Change-over experiments in practice. (ETS Research Report RB-71-38).
Princeton,  NJ:  Educational  Testing  Service. 

Beaton, A. E. (1964). The use of special matrix operators in statistical calculus.
(ETS Research Bulletin 64-51). Princeton, NJ: Educational Testing Service.

Beaton, A. E. (1973). F4STAT statistical system. Proceedings of Computer Science and
Statistics: 7th Annual Symposium on the Interface (pp. 279-282). Ames, IA: Iowa
State University.

Beaton,  A.  E.  (1984).  Statistical  issues  in  data  analysis  for  the  National  Assessment  of 
Educational  Progress.  Paper  presented  at  the  annual  meeting  of  the  American 
Statistical  Association,  Philadelphia.  August  1984. 

Beaton, A. E. (1985). NAEP analysis procedures and methodology. Paper presented at the
annual  joint  meeting  of  the  AERA  and  NCME,  Chicago,  April  1985. 

Beaton, A. E. (1986). Behanc: A program to assist in behavioral anchoring [Computer
program]. Princeton, NJ: Educational Testing Service.

Beaton, A. E., Mislevy, R. J., Kaplan, B., & Sheehan, K. M. (1986). Estimating group
effects from sparse, fallible assessment data: Procedures and methodology.
Princeton, NJ: Educational Testing Service.

Bejar, I. I. (1980). A procedure for investigating the unidimensionality of achievement
tests based on item parameter estimates. Journal of Educational Measurement, 17,
283-296.

Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's
ability. In F. M. Lord & M. R. Novick, Statistical theories of mental test scores.
Reading, MA: Addison-Wesley.




Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item
parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.

Bock, R. D., Gibbons, R. D., & Muraki, E. (1985). Full-information item factor
analysis (MRC Report No. 85-1). Chicago: National Opinion Research Center.

Bock, R. D., & Lieberman, M. (1970). Fitting a response model for n dichotomously
scored items. Psychometrika, 35, 179-197.

Bock, R. D., Mislevy, R. J., & Woodson, C. E. (1982). The next stage in educational
assessment. Educational Researcher, 11(3), 4-11, 16.

Box, G. E. P., & Tiao, G. C. (1973). Bayesian inference in statistical analysis.
Reading, MA: Addison-Wesley.

Campbell,  D.  T.,  &  Fiske,  D.  W.  (1959).  Convergent  and  discriminant  validation  by  the 
multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.

Carroll, J. B. (1945). The effect of difficulty and chance success on correlations
between items and between tests. Psychometrika, 26, 347-372.

Carroll, J. B. (1983). The difficulty of a test and its factor composition revisited.
In H. Wainer & S. Messick (Eds.), Principals of modern psychological measurement.
Hillsdale, NJ: Erlbaum.

Christoffersson, A. (1975). Factor analysis of dichotomized variables. Psychometrika,
40,  5-32. 

Chromy, J. R. (1979). Sequential sample selection methods. Proceedings of the Section
on Survey Research Methods, American Statistical Association, 401-406.

Cochran, W. G. (1977). Sampling techniques (3rd ed.). New York: John Wiley & Sons.

Coleman, J. S., Campbell, E. Q., Hobson, C. J., McPartland, J., Mood, A. M., Weinfeld, F.
D., & York, R. L. (1966). Equality of education, SDC No. FS5.238:38001.
Washington, DC: National Center for Educational Statistics.

Cook, L. L., Dorans, N. J., Eignor, D. R., & Petersen, N. S. (1985). An assessment of
the relationship between the assumption of unidimensionality and the quality of IRT
true-score equating. (ETS Research Report 85-30). Princeton, NJ: Educational
Testing Service.

Cook, L. L., & Eignor, D. R. (1984). Assessing the dimensionality of NAEP reading test
items: Confirmatory factor analysis of item parcel data. Paper presented at the
annual meeting of the American Educational Research Association, New Orleans, April
1984.

Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from
incomplete data via the EM algorithm. Journal of the Royal Statistical Society,
Series B, 39, 1-38.




Drasgow, F., & Parsons, C. K. (1983). Application of unidimensional item response
theory models to multidimensional data. Applied Psychological Measurement, 7,
189-199.

Education Commission of the States (1980). Writing achievement, 1969-79: Results from
the third national writing assessment: Volume 1 - 17-year-olds, volume 2 -
13-year-olds, volume 3 - 9-year-olds. Denver, CO: National Assessment of
Educational Progress.

Educational Testing Service (1983). ETS standards for quality and fairness. Princeton,
NJ: Author.

Educational Testing Service (1984). F4STAT, Version 2.7 [Computer program].
Princeton,  NJ:  Author. 

Fellegi, I. P. (1979). Approximate tests of independence and goodness of fit based on
stratified multi-stage samples. Survey Methodology, 4, 29-56.

Glaser, R. (1963). Instructional technology and the measurement of learning outcomes:
Some questions. American Psychologist, 18, 519-521.

Goldstein, H. (1980). Dimensionality, bias, independence and measurement scale problems
in latent trait test score models. British Journal of Mathematical and Statistical
Psychology, 33, 234-246.

Goldstein, H., & James, A. (1983). Efficient estimation for a multiple matrix sample
design. British Journal of Mathematical and Statistical Psychology, 36, 167-174.

Guttman, L. (1941). The quantification of a class of attributes: A theory and method
of scale construction. In P. Horst et al. (Eds.), The prediction of personal
adjustment (pp. 319-348). New York: Social Science Research Council.

Guttman, L. (1953). Image theory for the structure of quantitative variates.
Psychometrika, 18, 277-296.

Guttman, L. (1954). A new approach to factor analysis: The radex. In P. F. Lazarsfeld
(Ed.), Mathematical thinking in the social sciences (pp. 258-348). Glencoe, IL: The
Free Press.

Haberman, S. J. (1977). Log-linear models and frequency tables with small expected cell
counts. Annals of Statistics, 5, 1148-1169.

Haertel, E. (1984). An application of latent class models to assessment data. Applied
Psychological Measurement, 8, 333-346.

Hambleton, R. K., & Rovinelli, R. J. (1986). Assessing the dimensionality of a set of
test items. Applied Psychological Measurement, 10, 287-302.

Hansen, M. H., Hurwitz, W. N., & Madow, W. G. (1953). Sample survey methods and theory,
Volume I. New York: Wiley.




Hansen, M. H., Tepping, B. J., Lago, J. A., & Burke, J. (1984). National Assessment of
Educational Progress (NAEP) - the sample and data collection design for Year 15. Paper
presented at the meeting of the ASA, 1984.

Harris, C. W. (1962). Some Rao-Guttman relationships. Psychometrika, 27, 247-263.

Hattie, J. (1984). An empirical study of various indices for determining
unidimensionality. Multivariate Behavioral Research, 19, 49-78.

Hattie, J. (1985). Methodology review: Assessing unidimensionality of tests and items.
Applied Psychological Measurement, 9, 139-164.

Hendrickson, A. E., & White, P. O. (1964). PROMAX: A quick method for rotation to
oblique simple structure. British Journal of Statistical Psychology, 17, 65-70.

Hertzog, T., & Rubin, D. B. (1983). Using multiple imputation to handle nonresponse in
sample surveys. In W. G. Madow, I. Olkin, & D. B. Rubin (Eds.), Incomplete data in
sample surveys, Volume II: Theory and bibliographies. New York: Academic Press.

Holland, P. W. (1981). When are item response models consistent with observed data?
Psychometrika, 46, 79-92.

Holland, P. W., & Rosenbaum, P. R. (In press). Conditional association and
unidimensionality in monotone latent variable models. Annals of Statistics.

Holland, P. W., & Zwick, R. (1986). NAEP scaled scores. Memorandum to A. E. Beaton,
February 13, 1986.

Huber,  P.  J.  (1981).  Robust  statistics.  New  York:  Wiley. 

Hulin, C. L., Drasgow, F., & Parsons, C. K. (1983). Item response theory: Application to
psychological measurement. Homewood, IL: Dow Jones-Irwin.

Jacklin, C. N. (1979). Epilogue. In M. A. Wittig and A. C. Petersen (Eds.), Sex-related
differences  in  cognitive  functioning.  New  York:  Academic  Press. 

Johnson, E. G., & King, B. F. (1986). Generalized variance functions for a complex sample
survey.  (Technical  Report  No.  87-72).  Princeton,  NJ:  Educational  Testing  Service. 

Jones, L. V., Burton, N. W., & Davenport, E. C. (1982). Mathematics achievement levels of
black and white youth. (Report No. 165). Chapel Hill, NC: The L. L. Thurstone
Psychometric  Laboratory. 

Jungeblut, A. (1984). Assessing the dimensionality of NAEP reading test items: Linear
factor analysis models. Paper presented at the annual meeting of the American
Educational Research Association, New Orleans, April 1984.

Kaiser, H. F. (1963). Image analysis. In C. W. Harris (Ed.), Problems in measuring
change, pp. 156-166. Madison, WI: University of Wisconsin Press.

Kaiser, H. F. (1970). A second generation little jiffy. Psychometrika, 35, 401-415.




Kaiser, H. F., & Cerny, B. A. (1978). Pseudo-images and pseudo-anti-images from the
pseudo-inverse of a singular correlation matrix. British Journal of Statistical
Psychology, 99-101.

Kaiser, H. F., & Cerny, B. A. (1979). Factor analysis of the image correlation matrix.
Educational and Psychological Measurement, 39, 711-714.

Kelley, T. L. (1947). Fundamentals of statistics. Cambridge: Harvard University Press.

Kingston, N. M., & Dorans, N. J. (1981). The feasibility of using item response theory as
a psychometric model for the GRE Aptitude Test. (GRE Board Professional Report
79-12). Princeton, NJ: Educational Testing Service.

Kirsch, I. S., & Jungeblut, A. (1986). Literacy: Profiles of America's Young Adults,
Final Report. (Report No. 16-PL-01). Princeton, NJ: National Assessment of
Educational Progress.

Kish, L. (1967). Survey sampling. New York: John Wiley & Sons.

Kish, L., & Frankel, M. R. (1974). Inference from complex samples. Journal of the Royal
Statistical Society, Series B, 36, 1-22.

Lago, J. A., Burke, J. S., Tepping, B. J., & Hansen, M. H. (1985). Report on sample
selection, weighting, and variance estimation: NAEP-Year 15. Rockville, MD: Westat,
Inc.

Lord, F. M. (1962). Estimating norms by item-sampling. Educational and Psychological
Measurement, 22, 259-267.

Lord,  F.  M.  (1974).  Estimation  of  latent  ability  and  item  parameters  when  there  are 
omitted responses. Psychometrika, 39, 247-264.

Lord, F. M. (1980). Applications of item response theory to practical testing problems.
Hillsdale, NJ: Lawrence Erlbaum Associates.

Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores.
Reading,  MA:  Addison-Wesley. 

Mantel, N., & Haenszel, W. (1959). Statistical aspects of the retrospective study of
disease. Journal of the National Cancer Institute, 22, 719-748.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47,
149-174.

McDonald, R. P. (1967). Nonlinear factor analysis. Psychometric Monographs (No. 15).

McDonald, R. P. (1983). Exploratory and confirmatory nonlinear common factor analysis. In
H. Wainer & S. Messick (Eds.), Principles of modern psychological measurement.
Hillsdale, NJ: Erlbaum.

McDonald, R. P., & Ahlawat, K. S. (1974). Difficulty factors in binary data. British
Journal of Mathematical and Statistical Psychology, 27, 82-99.




Messick, S. J. (1980). Test validity and the ethics of assessment. American
Psychologist, 35, 1012-1027.

Messick, S. J., Beaton, A. E., & Lord, F. M. (1983). NAEP reconsidered: A new design for
a new era. (NAEP Report 83-1). Princeton, NJ: Educational Testing Service.

Mislevy, R. J. (1984a). GROUP: Estimation of group effects in univariate models
[Computer program]. Princeton, NJ: Educational Testing Service.

Mislevy, R. J. (1984b). Estimating latent distributions. Psychometrika, 49(3), 359-381.

Mislevy,  R.  J.  (1985a).  Inferences  about  latent  populations  from  complex  samples.  (ETS 
Research Report RR-85-41). Princeton, NJ: Educational Testing Service.

Mislevy, R. J. (1985b). Estimation of latent group effects. Journal of the American
Statistical Association, 80, 993-997.

Mislevy, R. J. (1985c). RESOLVE: Estimation of latent distributions by the method of
maximum likelihood [Computer program]. Mooresville, IN: Scientific Software, Inc.

Mislevy, R. J. (1986a). Exploiting auxiliary information about examinees in the
estimation of item parameters. (ETS Research Report RR-86-18). Princeton, NJ:
Educational Testing Service.

Mislevy, R. J. (1986b). A Bayesian treatment of latent variables in sample surveys. (ETS
Research Report RR-86-1). Princeton, NJ: Educational Testing Service.

Mislevy, R. J. (1986c). Recent developments in the factor analysis of categorical
variables. Journal of Educational Statistics, 11, 3-31.

Mislevy, R. J. (in progress). Approximating secondary biases in the analysis of Year 15
NAEP  reading  plausible  values. 

Mislevy, R. J., & Bock, R. D. (1982). BILOG: Item analysis and test scoring with binary
logistic models [Computer program]. Mooresville, IN: Scientific Software.

Mosenthal, P. B. (1985). An analysis of NAEP reading assessment items. Unpublished
manuscript, Syracuse University.

Mosteller, F., & Tukey, J. W. (1969). Data analysis, including statistics. In G. Lindzey
and E. Aronson (Eds.), Handbook of Social Psychology (2nd ed.). Reading, MA:
Addison-Wesley.

Mulaik, S. A. (1972). The foundations of factor analysis. New York: McGraw-Hill.

Mussen, P. H., Conger, J. J., & Kagan, J. (1969). Child development and personality. New
York: Harper and Row.

Muthén, B. (1978). Contributions to factor analysis of dichotomous variables.
Psychometrika, 43, 551-560.




NAEP: A proposal submitted in response to Grant Announcement No. PA-82-001: Technical
application. (1982). Princeton, NJ: Educational Testing Service.

Reading objectives: 1983-84 assessment. (1984). (NAEP Report 15-RL-10). Princeton, NJ:
National Assessment of Educational Progress.

The Reading Report Card: Progress toward excellence in our schools. (1985). (NAEP Report
15-R-01). Princeton, NJ: Educational Testing Service.

Reckase, M. D. (1979). Unifactor latent trait models applied to multifactor tests:
Results and implications. Journal of Educational Statistics, 4, 207-230.

Report on field operations and data collection activities, NAEP Year 15. (1984).
Rockville,  MD:  Westat,  Inc. 

Research Triangle Institute. (1982). Year 13 field operations and data collection
activities, National Assessment of Educational Progress. Research Triangle Park, NC:
Author.

Rivera, C., & Pennock-Roman, M. (1985). A comparison of race/ethnicity identification
methods in the National Assessments of Educational Progress. Paper presented at the
annual  meeting  of  the  AERA,  1985. 

Rosenbaum, P. R. (1984a). Testing the conditional independence and monotonicity
assumptions  of  item  response  theory.  Psychometrika,  49.  425-435. 

Rosenbaum, P. R. (1984b). Are the item responses of two groups of examinees consistent
with a difference in the distribution of a unidimensional latent variable? (Program
Statistics Research Technical Report No. 84-51). Princeton, NJ: Educational Testing
Service.

Rubin,  D.  B.  (1977).  Formalizing  subjective  notions  about  the  effects  of  nonresponse  in 
sample surveys. Journal of the American Statistical Association, 71, 538-543.

Rubin, D. B. (1978). Multiple imputations in sample surveys. Proceedings of the Survey
Research  Methods  Section  of  the  American  Statistical  Association,  20-34. 

Satterthwaite, F. E. (1946). An approximate distribution of estimates of variance
components.  Biometrics,  2,  110-114. 

Shah, B. V., Holt, M. M., & Folsom, R. E. (1977). Inference about regression models from
sample  survey  data.  Research  Triangle  Park,  NC:  Research  Triangle  Institute. 

Sheehan, K. M. (1985). M-GROUP: Estimation of group effects in multivariate models
[Computer program]. Princeton, NJ: Educational Testing Service.

Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response
theory. Applied Psychological Measurement, 7, 201-210.

Stout, W. F. (1984). The statistical assessment of latent trait dimensionality in
psychological testing. (ONR Report). Urbana-Champaign, IL: Department of
Mathematics, University of Illinois.




Tepping, B. J., & Hansen, M. H. (1984). Estimation of population covariances. Unpublished
memorandum. 

Traub, R. E., & Wolfe, R. G. (1981). Latent trait theories and the assessment of
educational achievement. Review of Research in Education, 9, 377-435.

Tukey, J. W. (1977). Exploratory data analysis. Reading, MA: Addison-Wesley.

Wingersky, B. (1984). Gramianizing matrices. Unpublished memorandum.

Wingersky, M. S. (1983). LOGIST: A program for computing maximum likelihood procedures
for logistic test models. In R. K. Hambleton (Ed.), Applications of item response
theory.  Vancouver,  BC:  Educational  Research  Institute  of  British  Columbia. 

Wingersky, M. S. (1984). MLE-ABIL: Maximum likelihood estimates of ability [Computer
program]. Princeton, NJ: Educational Testing Service.

Wingersky, M. S. (1986). Joint estimation procedures. Year 15 NAEP. Internal technical
report.  Princeton,  NJ:  Educational  Testing  Service. 

Wingersky, M. S., Barton, M. A., & Lord, F. M. (1982). LOGIST V user's guide.
Princeton,  NJ:  Educational  Testing  Service. 

Wilson, D., Wood, R. L., & Gibbons, R. (1983). TESTFACT: Test scoring and item factor
analysis [Computer program]. Chicago: Scientific Software.

Wolter, K. M. (1985). Introduction to variance estimation. New York: Springer-Verlag.

Writing objectives: 1983-84 assessment. (1982). (NAEP Report 15-W-10). Princeton, NJ:
National Assessment of Educational Progress.

Zwick, R. (1985). Bias of variance and covariance estimates in NAEP. Unpublished
memorandum.

Zwick,  R.  (1986a).  Assessment  of  the  dimensionality  of  the  NAEP  year  15  reading  data. 
(ETS  Research  Report  RR-86-4).  Princeton,  NJ:  Educational  Testing  Service. 

Zwick, R. (1986b). Effects of reader knowledge of student age on NAEP writing scores.
Internal  technical  report.  Princeton,  NJ:  Educational  Testing  Service. 




SUBJECT  INDEX 




Almanacs,  13.4 
format,  13.4.2 
interpretation  of,  13.4.3 
types, 13.4.2.1-13.4.2.8

Anchoring  scale  points,  10.5.2 

ARM.  See  Average  response  method 

Assessment  design,  2.1-2.9 

assigning  items  to  students,  2.5,  5-5.8 

considerations,  5.1 

database  creation,  2.8,  8,  8.6 

data  entry  system,  8.3 

editing  data,  8.4 

field  administration,  2.7,  7-7.4 

instruments  and  items,  2.6,  6-6.4 

processing assessment material, 8, 8.0.1, 8.1
professional  scoring,  8.2 
public-use  data  tape  construction,  8.7 
quality  control,  8.5 

reading  assessment  development,  3,  3.2 
sample selection and instrument collection, 2.4, 4-4.8
tabular  summary,  2.9 
writing  assessment  development,  3,  3.1 

Assessment items, 2.6, 6-6.4. See also
Reading items, Writing items,
Background and attitude items
assembling  into  blocks,  6.1.1 
assigning  to  students,  2.5,  5-5.8 

Assigning  items  to  students,  2.5,  5-5.8 
achieved  samples,  5.7 
advantages and disadvantages of spiral design, 5.8
BIB  spiral  sample,  5.2 
considerations  in  design,  5.1 
pairings  of  item  blocks,  5.4 
spiral  design,  5.5 
tape  sample,  5.6 
UBIB  spiral  sample,  5.3 


Average response method, 11.4
application to writing data, 11.4.4
bias of estimator, 11.4.3
method, 11.4.1
plausible values, 11.4.2

Background  and  attitude  data  analysis, 
9.3,  12-12.2 

Background and attitude items, 6.1.4
assembly into instruments, 6.1, 6.1.4
data,  in  almanacs,  13.4.2 
and  reporting  subgroups,  12-12.2 
specific  to  writing,  3.1.6 
use  in  validity  assessment,  14.1.2.3-4 

Batching  effect,  11.1.2 

Bias,  item 

prior  knowledge,  3.1.4.2 

review  of  reading  items  for,  3.2.3.4 

Bias,  statistical 

of ARM estimator, 11.4.3
effects  of  specification  errors  on 
plausible  values,  10.3.5 

BIB/pace  equating,  10.3.6 

BIB spiralling. See also BIB/UBIB
spiralling
design,  5.2 

impact  on  dimensionality  analyses, 
10.1.4 

BIB/UBIB spiralling, 2.5, 5-5.8
advantages and disadvantages, 5.8
BIB sample, 5.2
booklets, 2.5.1
pairings  of  item  blocks,  5.4 
UBIB  sample,  5.3 

BILOG, 10.3




Blocks 

assembly of items into, 6.1
pairings  in  spiral  design,  5.4 

Booklets 

assembly  of  blocks  into,  5.2,  5.3,  6.1 
spiral,  2.5.1 
tape,  2.5.2 

Catalog 

data  file,  8.7.5 
machine-readable,  8.7.8 
master,  8.6.3 

Clustering,  accounting  for  effects  of, 
13.2.2 

Cognitive items. See Reading items,
Writing items

Construct  validity,  14.1.2 

Consultants 

reading  items,  3.2.5 
writing  items,  3.1.8 

Content  validity,  14.1.1 

Data analysis, 9-14.2

background  and  attitude,  12-12.2 
parameter  estimation,  13-13.4 
reading,  10-10.5 
supplementary  studies,  14-14.2 
writing, 11.0-11.4

Database  creation,  2.8,  8,  8.0.2,  8.6 
extraction,  8.6.1 
file  merging,  8.6.2 
master catalog, 8.6.3

Data  entry,  8.3 

audit  trail  processing,  8.3.8 
data spooling, 8.3.11
forms, 8.3.7

machine  considerations,  8.3.2 
organization  and  structure,  8.3.3 
of  professional  scoring,  8.2.2.7 
program  structure,  8.3.4 
of  questionnaires,  8.3.9 
of school worksheet, 8.3.5
of  student  data,  8.3.6 
system  requirements,  8.3.1 


Derived variables, 12.1-12.2
age, 12.1.7
imputed race/ethnicity, 12.1.3
parental education, 12.1.6
and  reporting  subgroups,  12.1 
WARM,  12.2 

Design  effects,  14.2 

from  reading  assessment,  14.2.1 

Dimensionality,  10.1 

impact of BIB spiralling on, 10.1.4
methods  for  dichotomous  data,  10.1.2 
methods  for  reading  data,  10.1.3 

Editing data, 8.4

professionally  scored  items,  8.4.2 
student  and  questionnaire  data,  8.4.1 

Equating, BIB/pace, 10.3.6

Excluded student
questionnaire, 6.2

data entry, 8.3.9

editing data, 8.4.

quality control, 8.5.2

processing, 8.1.5
sample, 4.6

Field  administration,  2.7,  7-7.4 
assessment activities, 7.3
assessment  materials  and  forms,  7.4.2 
conduct  of  the  assessments,  7.3.2 
drawing  the  sample,  7.3.1 
field  management,  7.4 
monitoring  field  activities,  7.4.1 
pre-assessment  activities,  7.2 
pre-assessment materials and forms, 7.4.3

quality  control  and  evaluation,  7.3.5 
schedule  of  field  activities,  7.1 
soliciting cooperation of districts and schools, 7.2.3
students sampled, invited and assessed, 7.3.3

training  district  supervisors,  7.2.2 
training  exercise  administrators,  7.2.4 

Field  testing  of  reading  items,  3.2.3.5, 
3.2.3.7 




Forms 

for data entry, 8.3.7

and materials for assessment, 7.4.2

and materials for pre-assessment, 7.4.3

F4STAT,  9.4 

Grade/age, reporting subgroup definition,
12.1.7

Imputation, estimation of variability due
to, 13.3. See also Plausible values

Instruments, 2.9.1, 2.6, 6, 6.1. See also
Reading items, Writing items,
Background and attitude items,
Questionnaires

development,  2.3 

processing,  8.1.4 

IRT. See Item response theory

Item response theory, 10.0.1

and educational assessment, 10.0.2

Jackknife 

alternative estimators, 13.2.5
estimation  of  variability,  13.2.3 

Joint estimation procedures, 10.2
item parameter calibration, 10.2.2
maximum likelihood estimates of proficiency, 10.2.3
method,  10.2.1 

LOGIST,  10.2 

Marginal estimation procedures, 10.3
BIB/pace equating, 10.3.6
conditional effects estimate, 10.3.3
effects of specification errors on plausible values, 10.3.5
general model, 10.3.1
generation of plausible values, 10.3.4
item parameter estimation, 10.3.2
plausible values, 10.3.1.1

Mode of administration, effect on
estimates of writing performance, 11.2

Non-cognitive  items.  See  Background  and 
attitude  items 




Nonresponse, adjustment for, 13.1.2
school, 13.1.3
student, 13.1.4
students in spiral sessions, 13.1.6
students in tape sessions, 13.1.5

Objectives 
reading, 3.2.1

development, 3.2.2
writing, 3.1.1

development, 3.1.2

Parameter estimation, 9.4, 13-13.4
uncertainty due to sampling variability, 13.2
variability due to imputation, 13.3
use of almanacs, 13.4
weighting  procedures,  13.1 

Parental education, reporting subgroup
definition, 12.1.6

Participation results, student, 4.7

Plausible values, 10.3.1.1
and average response method, 11.4.2
effects of specification errors on, 10.3.5
generation of

in reading analysis, 10.3.4

in reading trend analysis, 10.4.5

Post-stratification, 13.1.9

Pre-assessment 
activities, 7.2
materials and forms, 7.4.3

Primary sampling units, 4.1
sample characteristics, 2.9.2

Processing assessment materials, 8, 8.0.1, 8.1
administration schedules, 8.1.2
professionally scored items, 8.1.4.2
questionnaires, 8.1.5
receipt of materials, 8.1.1
school worksheets, 8.1.3
student instruments, 8.1.4

Professional  scoring,  8.2 
data entry, 8.2.2.7




Professional scoring (continued)
description,  8.2.1 
editing  data,  8.4.2 
operation,  8.2.2 
processing  data,  8.1.4.2 
reliability  and  resolution,  8.2.2.6 
scorers,  8.2.2.1 
training,  8.2.2.2-8.2.2.4 
types 

holistic,  8.2.1.2 

mechanics, 8.2.1.3

primary  trait,  8.2.1.1 

Proficiency 

of American students, 15-15.1
and construct validity, 14.1.2.1
data, in almanacs, 13.4.2.3-13.4.2.6
maximum  likelihood  estimates  of,  10.2.3 

PSAT  scores  and  construct  validity, 
14.1.2.3,  14.1.2.4 

PSU.  See  Primary  sampling  units 

Public-use data tape construction, 8.7
codebooks,  8.7.6 
data  definition,  8.7.3 
data  file  catalogs,  8.7.5 
data  file  layouts,  8.7.4 
file  definition,  8.7.1 
machine-readable  catalog  files,  8.7.8 
SAS  and  SPSS-X  control  files,  8.7.7 
variable  definition,  8.7.2 

Quality  control,  8.5 

during  field  administration,  7.3.5 
of  questionnaire  data 

excluded  student,  8.5.2 

school  characteristics  &  policy,  8.5.4 

teacher,  8.5.3 
of  student  data,  8.5.1 
summary,  8.5.5 

Questionnaires 
data  entry,  8.3.9 
editing  data,  8.4.1 
excluded  student,  6.2 
processing data, 8.1.5
quality  control,  8.5.2-8.5.4 
school  characteristics  and  policy,  6.4 
teacher,  6.3 


Race/ethnicity,  reporting  subgroup 
definition 
imputed,  12.1.3 
observed,  12.1.2 

Reading data analysis, 9.1, 10-10.5
dimensionality, 10.1
item response theory, 10.0.1, 10.0.2
estimation procedures

joint, 10.2

marginal, 10.3
scaling, 10.5
trend,  10.4 

Reading  items,  3.2.4,  6.1.2 
assembling  into  blocks,  6.1.1 
bias  review,  3.2.3.4 
data,  in  almanacs,  13.4.2 
design  effects,  14.2.1 
development,  3,  3.2 

consultants,  3.2.5 
field  testing,  3.2.3.5,  3.2.3.7 
objectives,  3.2.1 

development,  3.2.2 
and  student  instruments,  6.1 
use  in  trend  analysis,  10.4.2 

Reading  scale,  10.5 

anchoring  scale  points,  10.5.2 

Reading  trend 
analysis,  10.4 

data,  in  almanacs,  13.4.2.6-13.4.2.7 

estimation  of,  10.4.1 

conditional  effects  for,  10.4.4 
item  parameters  for,  10.4.3 

generation  of  plausible  values  for, 
10.4.5 

selection  of  items  for,  10.4.2 

Region,  reporting  subgroup  definition, 
12.1.5 

Reliability 

inter-rater (writing data), 11.1.1
professional  scoring,  8.2.2.6 

Reporting  subgroups,  12-12.2 
definition  of 

grade/age, 12.1.7

imputed race/ethnicity, 12.1.3

observed race/ethnicity, 12.1.2




Reporting  subgroups  (continued) 
parental education, 12.1.6
region, 12.1.5
sex, 12.1.1

size and type of community, 12.1.4
and  derived  variables,  12.1 

Sample selection, 2.4, 4-4.8
excluded  student  sample,  4.6 
primary  sampling  units 

characteristics,  2.9.2 

sample,  4.1 
schools 

assignment  of  sessions  to,  4.4 

characteristics,  2.9.2 

initial  sample,  4.2 

updating  the  sample,  4.3
students 

characteristics,  2.9.3 

participation  results,  4.7 

samples,  4.5 
tabular  summary,  2.9
teacher 

characteristics,  2.9.2 

sample,  4.8

SAS  control  files,  8.7.7

Scale.  See  Reading  scale,  Writing  scale

Schedule 

of  assessments,  2.2 
of  field  activities,  7.1 

Schools

assignment  of  sessions  to,  4.4 

characteristics  and  policy  questionnaire,  6.4
data  entry,  8.3.9
editing  data,  8.4.1
processing  data,  8.1.5 
quality  control,  8.5.4 

initial  sample,  4.2 

sample  characteristics,  2.9.2 

updating  the  sample,  4.3

Sex,  reporting  subgroup  definition,  12.1.1

Size  and  type  of  community,  reporting 
subgroup  definition,  12.1.4 

Spiralling.  See  BIB/UBIB  spiralling


SPSS-X  control  files,  8.7.7 

Stratification,  accounting  for  effects  of,
13.2.2

Students 

assignment  of  items  to,  5-5.8
assessment  data

editing,  8.4.1

entry,  8.3.6

processing,  8.1.4

quality  control,  8.5.1 
assessment  instruments,  6.1 
associated  teacher  sample,  4.8
drawing  the  sample  of,  7.3.1
participation  results,  4.7,  7.3.3
sample  characteristics,  2.9.3
samples  of,  4.5 

Supplementary  studies,  9.5,  14-14.2
design  effects,  14.2
validity  issues,  14.1

Systematic  selection,  accounting  for 
effects  of,  13.2.2 

Tape 

booklets,  2.5.2,  5.2,  5.3,  6.1
sample,  5.6 

Teachers 

questionnaire,  6.3

data  entry,  8.3.9 

editing  data,  8.4.1 

processing,  8.1.5

quality  control,  8.5.3
sample,  4.8

sample  characteristics,  2.9.2 

Training 
district  supervisors,  7.2.2 
exercise  administrators,  7.2.4 
professional  scorers 

holistic,  8.2.2.3

mechanics,  8.2.2.4

primary  trait,  8.2.2.2

Trend.  See  Reading  trend,  Writing  trend

UBIB  spiral  design,  5.3.  See  also
BIB/UBIB  spiralling 




Validity,  14.1
construct,  14.1.1
content,  14.1.2
issues,  14.1
summary  of  evidence,  14.1.3

Writing  trend
data,  in  almanacs,  13.4.2.8
estimation  of,  11.3

WARM  variables,  12.2 

Weighted  average  response  method  variables,  12.2

Weighting,  13.1 
adjustments 

for  missing  tape  session,  13.1.7 
for  nonresponse,  13.1.2 
for  school  nonresponse,  13.1.3 
for  student  nonresponse,  13.1.4 
for  student  nonresponse,  spiral  sessions,  13.1.6

for  student  nonresponse,  tape  sessions,  13.1.5

computation  of  base  weight,  13.1.1

final  student  weight,  13.1.10

full-sample  weight,  13.1.10

post-stratification,  13.1.9 

procedures,  13.1 

trimming  to  reduce  mean  squared  error,
13.1.8

Writing  data  analysis,  9.2,  11-11.4
average  response  method  of  scaling,  11.4
batching  effect,  11.1.2
effect  of  mode  of  administration,  11.2
inter-rater  reliability,  11.1.1
trend,  11.3
writing  data,  11.1

descriptive  statistics,  11.1.3 

Writing  items,  3.1.5,  6.1.3 
assembling  into  blocks,  6.1.1
background  and  attitude,  3.1.6
data,  in  almanacs,  13.4.2 
development,  3,  3.1 

consultants,  3.1.8 
objectives,  3.1.1 

development,  3.1.2 
and  prior  knowledge  bias,  3.1.4.2
and  student  instruments,  6.1
and  types  of  writing  tasks,  3.1.4.1
use  in  trend  analysis,  11.3

Writing  scale,  11.4.  See  also  Average
response  method 




